Libraries

ADS Libraries can be used for creating and curating collections of bibliographic records. Libraries can be created or edited through the ADS website, or through the ADS API. This page provides a guide on how to create and manipulate libraries through the ads package.

You will need the following imports in order to execute all code blocks on this page:

import ads
from datetime import datetime

The ads.Library object

The ads.Library data model represents a library managed by ADS. If you have the right access for a given library, then these fields are editable:

ads.Library.name

Name given to the library.

ads.Library.description

Description of the library.

ads.Library.public

Whether the library is public.

ads.Library.owner

The ADS username of the owner of the library.

The following fields are read-only, regardless of your access:

ads.Library.id

Unique identifier for this library, which is assigned by ADS.

ads.Library.num_users

Number of users of the library.

ads.Library.num_documents

Number of documents in the library.

ads.Library.date_created

Date (UTC) the library was created.

ads.Library.date_last_modified

Date (UTC) the library was last modified.

These fields are discussed in more detail below.

Retrieve existing libraries

If you have existing libraries in your ADS account you can immediately select these using the ads package. If you want to retrieve a single library you can use ads.Library.get(), and if you don’t supply any keyword arguments to this function then you will get the first library that is returned by ADS.

Below are a few examples showing how to retrieve libraries from ADS, with various degrees of complexity.

# Retrieve a single library based on an exact query expression.
library = ads.Library.get(name="SDSS-IV")

# Get a single library, but we don't care which one.
library = ads.Library.get()

The ads.Library.get() method is for retrieving a single library. If you want to retrieve all your libraries, or your libraries based on some query expression, then you can use the ads.Library.select() method:

# Retrieve all my current libraries
libraries = ads.Library.select()

# Retrieve libraries based on a more complex query expression
libraries = ads.Library.select().where(
    ads.Library.description.like("Gaia") & 
    (
        (ads.Library.num_documents > 100) |
        (ads.Library.date_last_modified >= datetime(2021, 12, 1))
    )
)

In most cases you will want to iterate over the query result to retrieve your libraries:

# Iterate through all my libraries.
for library in libraries:
    print(f"{library.id} {library.name}: {library.description}")

# After you've iterated through the libraries, you can select libraries by their index,
# or iterate over them again without any more API calls made to ADS.
last_library = libraries[-1]

small_libraries = [lib for lib in libraries if lib.num_documents < 5]
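Filtering like this happens locally, after the libraries have been retrieved. As a minimal sketch, the hypothetical helper below (split_by_size is not part of the ads package) splits library-like objects by size using only the name and num_documents attributes described on this page. Stand-in objects with made-up values are used so the example runs without an ADS account.

```python
from types import SimpleNamespace


def split_by_size(libraries, threshold=5):
    """Split library-like objects into (small, large) lists by num_documents."""
    small, large = [], []
    for lib in libraries:
        (small if lib.num_documents < threshold else large).append(lib)
    return small, large


# Stand-ins for ads.Library objects (hypothetical data, not real libraries).
fake_libraries = [
    SimpleNamespace(id="a1", name="Gaia", num_documents=120),
    SimpleNamespace(id="b2", name="Scratch", num_documents=3),
    SimpleNamespace(id="c3", name="To read", num_documents=4),
]

small, large = split_by_size(fake_libraries)
small_names = [lib.name for lib in small]
large_names = [lib.name for lib in large]
print(small_names, large_names)
```

With a real account, the same helper should accept the result of ads.Library.select() in place of fake_libraries, assuming the query result is iterable as shown above.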

But you can also apply further operations to your query object, like limit, sort, or filter:

top_5_libraries = ads.Library.select()\
                             .order_by(ads.Library.num_documents.desc())\
                             .limit(5)
for library in top_5_libraries:
    print(f"{library.id}: {library.name} has {library.num_documents}")

In these examples, each library is an ads.Library object that stores the metadata, documents, and permissions for that library. The ads.Library object also has a number of methods that can be used to manipulate the library.

Creating a new library

You can create a new library locally and add or remove documents, or perform set operations with other libraries. When you’re finished, you can save the library to your ADS account using the ads.Library.save() method.

library = ads.Library(name="Example")

# Perform some operations

# Save the changes to the ADS library
library.save()

Saving your changes to ADS

Any time you make changes to a library, these won’t be automatically updated to ADS until you use the ads.Library.save() function. Examples of changes that you will need to save include:

  • Creating a library

  • Adding or removing documents, including emptying all documents from a library

  • Updating metadata (e.g., name, description)

  • Updating permissions

It’s okay to save() if you don’t know whether you need to save or not. If there are no changes that need to be updated, then nothing will happen.
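One common way to get this "save only when needed" behaviour is to track which fields have changed since the last save. The sketch below illustrates the idea with a toy class; TrackedRecord and its methods are hypothetical and are not how the ads package is necessarily implemented.

```python
class TrackedRecord:
    """Toy record that remembers which fields changed since the last save."""

    def __init__(self, **fields):
        self._fields = dict(fields)
        self._dirty = set()
        self.save_calls = 0  # counts simulated remote updates

    def set(self, key, value):
        # Only mark the field as dirty if the value really changed.
        if self._fields.get(key) != value:
            self._fields[key] = value
            self._dirty.add(key)

    def save(self):
        """Send pending changes; return the number of fields updated."""
        if not self._dirty:
            return 0  # nothing to do, so no request would be made
        updated = len(self._dirty)
        self.save_calls += 1
        self._dirty.clear()
        return updated


record = TrackedRecord(name="Example", description="")
first = record.save()              # no changes yet
record.set("description", "Demo")
second = record.save()             # one field changed
third = record.save()              # already saved, nothing happens
print(first, second, third, record.save_calls)
```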

Warning

You will not receive any warning if your Python script finishes or crashes before you call ads.Library.save().

Managing documents

Accessing documents

If you want to access the documents in a library then you can simply iterate over them:

for document in library:
    print(f"{document.bibcode} {document.journal} {document.title}")

You can also access documents by an index (e.g., library[4]) or slicing (e.g., library[4:10]), but this is not recommended because no explicit sort can be given to ADS when we are retrieving the documents in a library.
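If you do need a stable position for each document, one option is to sort the documents locally before indexing. The sketch below uses stand-in objects that carry only a bibcode attribute (the bibcodes are the ones used elsewhere on this page); with a real library you would pass the library itself to sorted(), assuming it is iterable as shown above.

```python
from types import SimpleNamespace

# Stand-ins for ads.Document objects, so the example runs offline.
docs = [
    SimpleNamespace(bibcode="2000A&AS..143...41K"),
    SimpleNamespace(bibcode="1991ASSL..171..139W"),
]

# Sort locally by bibcode so that index 0 always means the same document.
ordered = sorted(docs, key=lambda doc: doc.bibcode)
first_bibcode = ordered[0].bibcode
print(first_bibcode)
```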

Adding documents

When it comes to adding or removing documents, the ads.Library object behaves a bit like a set or list. You can use ads.Library.append() or ads.Library.extend() to add documents, or use the addition operator += in Python:

library = ads.Library.get(name="Example")

documents = [
    ads.Document.get(bibcode="2000A&AS..143...41K"),
    ads.Document.get(bibcode="1991ASSL..171..139W")
]

# Add the documents to the library.
# There are three ways you can do this. All produce the same result.
# 1. Use the addition operator.
library += documents

# 2. Or use the .extend function for a list of documents.
library.extend(documents)

# 3. Or use the .append function for an individual document.
for document in documents:
    library.append(document)

# You will need to save your library to have the changes reflected on ADS.
library.save()

If the document is already in the library then it won’t be duplicated. In this way the ads.Library object behaves like a set, but here you can use addition and subtraction operators (+= and -=), which is unlike a set and more like a list.
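To make the "set that supports += and -=" behaviour concrete, here is a toy container that behaves in the way described. MiniLibrary is a hypothetical illustration written for this page, not the ads.Library implementation, and it stores plain bibcode strings rather than documents.

```python
class MiniLibrary:
    """Toy ordered collection that ignores duplicates."""

    def __init__(self, items=()):
        self._items = []
        self.extend(items)

    def append(self, item):
        # Like a set: adding an existing item has no effect.
        if item not in self._items:
            self._items.append(item)

    def extend(self, items):
        for item in items:
            self.append(item)

    def __iadd__(self, items):
        # Like a list: += adds every item in the iterable.
        self.extend(items)
        return self

    def __isub__(self, items):
        self._items = [i for i in self._items if i not in items]
        return self

    def __len__(self):
        return len(self._items)

    def __iter__(self):
        return iter(self._items)


lib = MiniLibrary()
lib += ["2000A&AS..143...41K", "1991ASSL..171..139W"]
lib.extend(["2000A&AS..143...41K"])  # duplicate, ignored
size_after_add = len(lib)
lib -= ["1991ASSL..171..139W"]
remaining = list(lib)
print(size_after_add, remaining)
```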

Removing documents

If you want to remove a document from a library then you can use the ads.Library.remove() or ads.Library.pop() methods, or just use the subtraction operator:

# To remove a single document
library.remove(documents[-1])

# Remove a document based on its index
library.pop(0) 

# Empty the library of all documents
library.empty()

# You will need to save your library to have the changes reflected on ADS.
library.save()

Metadata

Each library has associated metadata.

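These fields are read-only: id, num_users, num_documents, date_created, and date_last_modified.

These metadata fields can be edited by the owner or by an administrator of the library: name, description, and public. The owner field can also be changed, which transfers ownership of the library.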

You can access all of these fields as attributes of the ads.Library class. For example:

# Retrieve any single library.
library = ads.Library.get()

print(
f"""
{library.id} {library.name} has:
 - {library.num_users} users 
 - {library.num_documents} documents
The library is owned by {library.owner} and is {'public' if library.public else 'private'}.
The library description is: {library.description}
"""
)

If you want to change the name, description, or public field of a library then you can directly edit the attribute of the ads.Library object, and then save your changes. An exception will be raised if you try to edit any of the read-only metadata fields.

# Update the library metadata.
library.name = "New name"
library.description = "A new description"
library.public = not library.public

# Save the changes to ADS.
library.save()

Changing the library.owner property will transfer the ownership of the library to another user the next time the library is saved. See transfer ownership of a library.

Permissions

You can give specific permissions for other ADS users to be able to read, write, or administer your library. You can view the permissions for a library with the ads.Library.permissions attribute, which is an ads.models.library.Permissions object, but you can treat it like a Python dict with email addresses as keys and a list of permissions as values. For example:

import json
print(json.dumps(library.permissions, indent=2))
{
  "andrew.casey@monash.edu": [
    "owner"
  ],
  "ada.lovelace@gmail.com": [
    "read", "write", "admin"
  ]
}

If you want to modify the permissions of a library then you can directly edit the ads.Library.permissions attribute, and save the library. For example:

# An exception will be raised if bob@gmail.com does not have an ADS account.
library.permissions["bob@gmail.com"] = ["read"]
library.save()

The valid permission keys you can assign to a user are read, write, and admin. You can see that owner is also a permission key, but you cannot change the owner by editing ads.Library.permissions. To do that you need to transfer ownership of a library.
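If you build permission lists programmatically, it can help to validate them locally before assigning them. The helper below is a small sketch of that check; check_permissions and VALID_PERMISSIONS are hypothetical names, and the only facts taken from this page are the three assignable keys (read, write, admin) and that owner cannot be assigned this way.

```python
VALID_PERMISSIONS = {"read", "write", "admin"}


def check_permissions(requested):
    """Return the requested permissions as a sorted list, or raise ValueError."""
    requested = set(requested)
    invalid = requested - VALID_PERMISSIONS
    if invalid:
        raise ValueError(f"Cannot assign permissions: {sorted(invalid)}")
    return sorted(requested)


ok = check_permissions(["write", "read"])

# "owner" appears in the permissions mapping but cannot be assigned directly.
try:
    check_permissions(["owner"])
    rejected = False
except ValueError:
    rejected = True

print(ok, rejected)
```

The validated list could then be assigned in the same way as above, for example library.permissions["bob@gmail.com"] = check_permissions(["read"]).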

Set operations

The ADS API allows for set operations that you can perform with two or more libraries. These include union, intersection, difference, copy, and empty.

You could perform these operations locally, using the ADS API only to save your final library; the ‘Local’ version of each code example below shows how this is done. However, using the remote ADS endpoints for some of these operations means that ADS will return a new library for you, which can be convenient.

Union

The union of a collection of sets is the set of all elements in the collection.

Remote

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")
library_c = ads.Library.get(id="<public library id C>")

# Create a new library that is the union of these libraries.
library_union_abc = library_a.union(library_b, library_c)

Local

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")
library_c = ads.Library.get(id="<public library id C>")

# Create a set of documents that are contained in any of these libraries.
documents_union_abc = set().union(library_a, library_b, library_c)

# Create a new library.
library_union_abc = ads.Library(documents=documents_union_abc)
library_union_abc.save()

Intersection

The intersection of two sets A and B is the set containing all elements of A that also belong to B.

Remote

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")

# Create a new library that is the intersection of A and B.
library_ab = library_a.intersection(library_b)

Local

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")

# Create a new library that is the intersection of A and B.
documents = set(library_a).intersection(library_b)
library_ab = ads.Library(documents=documents)
library_ab.save()

Difference

The difference of A and B is the set of elements in A but not in B. This is not always a symmetric operation: the difference of B and A is the set of elements in B but not in A.

Remote

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")

# Create a new library that is the difference of A and B.
library_ab = library_a.difference(library_b)

# Create a new library that is the difference of B and A.
library_ba = library_b.difference(library_a)

Local

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")

# Create a new library that is the difference of A and B.
documents_ab = set(library_a).difference(library_b)
library_ab = ads.Library(documents=documents_ab)
library_ab.save()

# Create a new library that is the difference of B and A.
documents_ba = set(library_b).difference(library_a)
library_ba = ads.Library(documents=documents_ba)
library_ba.save()
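The union, intersection, and difference operations follow ordinary set semantics, which can be checked with plain Python sets. The sketch below uses placeholder labels (doc1, doc2, and so on) instead of real bibcodes, and shows in particular that the difference depends on the order of its operands.

```python
# Placeholder document labels standing in for the contents of two libraries.
library_a = {"doc1", "doc2", "doc3"}
library_b = {"doc3", "doc4"}

union = library_a | library_b          # documents in either library
intersection = library_a & library_b   # documents in both libraries
a_minus_b = library_a - library_b      # documents in A but not in B
b_minus_a = library_b - library_a      # documents in B but not in A

print(sorted(union), sorted(intersection), sorted(a_minus_b), sorted(b_minus_a))
```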

Copy

This operation will copy all the documents from library A to library B.

Remote

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")

# Copy the documents from library A to library B.
library_a.copy(library_b)

Local

library_a = ads.Library.get(id="<public library id A>")
library_b = ads.Library.get(id="<public library id B>")

# Copy the documents from library A to library B.
library_b += library_a
library_b.save()

Empty

This will empty a library of all its documents. The library itself will still exist, but it will contain no documents. If you want to delete the library (and all its documents), use the ads.Library.delete() function.

Remote

library = ads.Library.get(id="<public library id A>")

# Empty the library of all its documents using the .empty() function.
library.empty()

Local

library = ads.Library.get(id="<public library id A>")

# Empty the library by setting the .documents attribute to be an empty list.
library.documents = []

# Unlike the .empty() method, we need to save our changes.
library.save()

Transfer ownership of a library

You can transfer the ownership of your library to another ADS user. The owner of the library is given by the ads.Library.owner property. To transfer ownership you can simply change the value of this attribute, and save the library.

Note

The ads.Library.owner attribute is a little inconsistent. For libraries to which you have read (or higher) access, the ads.Library.owner property returns the ADS username of the account that owns the library. However, if you want to transfer the ownership of a library to another user, you need to set the .owner attribute to the email address that the new owner uses for their ADS account.

After the transfer has occurred, if the new owner were to retrieve the Library then they would see their ADS username in this field, even though you needed to provide their email address to make the transfer happen. If you supply an invalid email address, or an email address that is not associated with any ADS account, then an exception will be raised.

Let’s create a new library and transfer the ownership to another user:

# Create a new library.
library = ads.Library()

# Add a document.
library += ads.Document.get(bibcode="2000A&AS..143...41K")

# Transfer the ownership to another ADS user.
library.owner = "ada.lovelace@gmail.com"

# Everything we have done so far has been performed locally.
# We will need to save this library to push the changes to ADS.
library.save()

# If there is an ADS account associated with the email address above, 
# then the transfer will be successful, and any operations we want to
# make on this library will now be forbidden by ADS!
# (See warning below)
library.refresh()

Warning

Once you transfer ownership of a library to another user you will immediately lose all read and write access to that library.

The moment that someone else owns the library, you cannot give yourself read, write, or admin permissions. And if you own the library, then you cannot edit your own permissions. That means if you want to transfer ownership of a library to another user and keep some permissions (e.g., read-only), you have to ask the new owner to update the library permissions.

Delete a library

Using ads.Library.empty() will remove all documents from a library, but the library itself will still exist with zero documents. If you want to delete a library from ADS, you can use ads.Library.delete():

# Create a temporary library.
library = ads.Library.create(name="Temporary library")

# Delete it!
library.delete()

The library object will still exist in your Python script, but any further modifications you make to the library will result in an error, because ADS has deleted the library from their server.

Search for documents in a library

You can combine searches for documents in libraries without much user effort. (Instead, the ads package is doing the work for you.) If you wanted to search among documents in a library for those published in 2020, here’s what it might look like:

library = ads.Library.get()
documents = ads.Document.select()\
                        .where(ads.Document.in_(library) \
                            & (ads.Document.year == 2020)
                        )

That kind of query is so simple that you could do the same thing locally:

documents = list(filter(lambda doc: doc.year == 2020, library))

But for a query with ADS fields or operators that are searchable but not viewable, you can use the ads.Document and ads.Library object relational mappers to execute them. Here are some examples:

# Find documents in this library that are trending in exoplanets.
trending_exo_docs = ads.Document.select()\
                                .where(
                                    ads.Document.trending("exoplanets") \
                                    &   ads.Document.in_(library)
                                )\
                                .order_by(
                                    ads.Document.read_count.desc()
                                )

# Match for some keyword in the virtual `all` field, which checks:
# author_norm, alternate_title, bibcode, doi, identifier.
jwst_docs = ads.Document.select()\
                        .where(
                            ads.Document.all.like("JWST") \
                            &   ads.Document.in_(library)
                        )

# Find recent documents that match some keywords but are not in a library.
gaia_library = ads.Library.get(name="Gaia EDR3 papers")
gaia_docs = ads.Document.select()\
                        .where(
                            ads.Document.abs.like("Gaia") \
                            &   (~ads.Document.in_(gaia_library)) \
                            &   (ads.Document.date >= gaia_library.date_last_modified)
                        )

How does this work?

Most users don’t need to know how this works. But if you’re interested, read on.

The expressions given in the .where() clause are parsed by the ads.models.document.DocumentSelect object into a search string that ADS can understand. Most search requests to the ADS API use the /search/query endpoint. But there are limitations on this endpoint. For example, if we wanted to search for documents that match the “JWST” phrase and are also in some library, then we have to construct an ADS search string like all:JWST AND bibcode:(A OR B OR C OR ...), where A, B, C, etc are bibcodes of documents that are in the library. Making an ADS search with a term like bibcode:(A OR B OR C OR ...) is prohibitively expensive, and the standard /search/query endpoint will raise an exception if the search is going to be too expensive.

Instead, if the expression in .where() includes a many-comparison restriction on ads.Document.bibcode then the ads.models.document.DocumentSelect will use the /search/bigquery ADS API endpoint, which allows for efficient searching given a list of many bibcodes. This endpoint has different parameters, restrictions, and rate limits than the standard endpoint, but the ads package manages all of this for you. Hopefully, you should never even know which endpoint was used.
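The idea can be sketched in a few lines. The functions below are hypothetical and are not the code used by ads.models.document.DocumentSelect: build_query assembles the kind of search string described above, and choose_endpoint picks an endpoint path based on a made-up threshold (max_inline), since the real criteria are not stated on this page.

```python
def build_query(term, bibcodes):
    """Build a search string of the form 'term AND bibcode:(A OR B OR ...)'."""
    return f"{term} AND bibcode:({' OR '.join(bibcodes)})"


def choose_endpoint(bibcodes, max_inline=10):
    """Pick an endpoint path; max_inline is an assumed threshold for illustration."""
    return "/search/bigquery" if len(bibcodes) > max_inline else "/search/query"


query = build_query("all:JWST", ["A", "B", "C"])
small = choose_endpoint(["A", "B", "C"])
large = choose_endpoint([f"bib{i}" for i in range(500)])
print(query, small, large)
```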