Discussions¶
So PyPI tokens are based on Macaroons.
What is a macaroon, how does it work, how is it secure?¶
At heart, a macaroon is an introspectable token, it lets its bearer be identified. There’s an identifier, which is readable in the token, and a secret key which is not included in the macaroon. A HMAC hash of the identifier and the secret key is also present in the token. This results in the following properties:
Given a macaroon, you can extract the identifier easily
If you know the identifier and the key, you can check that the macaroon has not been modified
Given just the macaroon, you cannot extract the key. The key is the secret part that lets you generate and check a macaroon’s authenticity.
But then the power of the macaroon comes from the following: you can add caveats to
the macaroon. A caveat is a readable string. When you add a caveat, you take the
previous HMAC
, and use it as a key for a new HMAC of the caveat. This gives you
the following properties:
Given a macaroon, you can still extract the identifier and the caveat strings
Given a macaroon and its key, you can check each step by computing the same
HMAC
on your side and check that your hash is still consistentGiven a macaroon, anyone can add a caveat
Given a macaroon without the key, it’s impossible to remove a caveat. Intermediate hashes are discarded each time you add a caveat, so the only way is forward (adding more caveats).
The specification doesn’t say what the caveat strings should be, but the idea is that each caveat is adding a restriction to the original macaroon, so that when the bearer has to send the macaroon, they only delegate the smallest fraction of their power.
It’s up to the Macaroon original minter to decide how to understand the caveats.
How are macaroons implemented in this library?¶
TL;DR: they’re not :D This library is a abstraction over PyMacaroons, but with a specialization on the needs of PyPI. Here are a few things this library does:
Distinguish between
tokens
andmacaroons
. PyPI tokens have apypi-
prefix. A token is a serialized macaroon with a prefix.Rename caveats “restrictions”. The word caveat is nice and latin and all, but as far as the author can tell, in international contexts, people tend not to understand immediately what “caveat” mean in this context. Given the idea of this lib is to ease the use of PyPI tokens, where those caveats are mainly used as restrictions, it felt more logical to name it this way.
How are macaroons implemented in PyPI?¶
PyPI keeps a database table of macaroons, with identifiers, keys, and related users. When a macaroon bearer uploads a release using their macaroon, the identifier is read, then from the database, the key and user are extracted. The macaroon is checked using the key and caveats are checked against the upload information. If the macaroon is valid, then PyPI checks if the user has upload rights on the package, and then proceeds.
The caveats are json-encoded strings, and as of October 2022, they come in 7 flavors:
4 new caveats and 3 legacy caveats.
The legacy caveat are represented by classes prefixed by Legacy
.
The types of caveats are:
[0, <exp: int>, <nbf: int>]
is met if we try uploading the project between timestampsnbf
(included) andexp
(excluded). It’s represented by the classDateRestriction
. Legacy format is"{"nbf": <timestamp: int>, "exp": <timestamp: int>}"
and the corresponding class isLegacyDateRestriction
.[1, [<project_name: str>, ...]]
is met if the project we upload is among the ones listed in the caveats. It’s represented by the classProjectNamesRestriction
. Legacy format is{"version": 1, "permissions": {"projects": [<project_name: str>, ...]}}
and the corresponding class isLegacyProjectNamesRestriction
.[2, [<project_id: str>, ...]]
is met if the project we upload is among the ones listed in the caveats. It’s represented by the classProjectIDsRestriction
. There is no legacy equivalent.[3, <user_id: str>]
is met if the user triggering the upload is the one associated whit that UUID. It’s represented by the classUserIDRestriction
{"version": 1, "permissions": "user"}
which is always met. It’s represented in this library by the classLegacyNoopRestriction
.
Within the PyPI website as of October 2022, one may generate tokens. Those tokens are either associated with a single project or with the how account.
If they are associated to a single project, they will come with a
ProjectNamesRestriction
with a single value (the normalized name of the project)
and a ProjectIDsRestriction
with a single value (the ID if the project).
If they are not associated with a project, they will come with a UserIDRestriction
associated with the user who generated the token.
If they were generated before August 2022, they may come with legacy restrictions.
There is currently no way in the website’s interface to generate tokens with
a DateRestriction
or associated with multiple projects, but if such a token
was received, PyPI would interpret it correctly.
Note
You still need to be a project maintainer or owner for your tokens to work on a project. Restrictions can only reduce the scope of a token.
Do we really need an abstraction layer over PyMacaroons?¶
Yes ? No ? Maybe ? What’s clear is that the community would benefit from a single source of macaroons, so that the same codebase would be responsible for serializing and deserializing the custom PyPI token restrictions.
Can we add new restrictions?¶
As long as PyPI doesn’t use pypitoken
to generate tokens, it’s not very
useful to implement new restrictions here that would not already be supported
by PyPI. We’re doing our best to have this library follow developments of PyPI
itself, and provide feature-parity.
In discussions around PyPI development, the following restrictions that have been mentionned:
Version-based restriction
Filename-based restriction
Hash-sum-based restriction
IP-based restriction
One-time-use restriction (this will require Warehouse to remember a value)
Somehow restricting to uploads coming from a given project’s CI
Most of those were initially discussed in the Warehouse tracker.
There would be 2 main categories of restrictions:
Restrictions you apply just before uploading a release to limit the possible consequences of token steal or replay attack,
Restrictions you apply before handing your token over to a third party, to ensure they can’t mis-use it.
Is this library a part of PyPI?¶
It’s being developper externally. The initiator of this project is a member of the Python Packaging Authority (PyPA) and PyPI moderator, but not an admin nor a committer.
There was an offer for this library to be adopted by PyPA, but it didn’t gain any traction
Why is there a noop restriction?¶
Good question. The author is not sure either. In the original discussions in Warehouse,
the idea was to have 2 types of tokens: “user” tokens and “projects” tokens. But even
without restrictions, tokens are already scoped to a specific user, so adding a “user”
restriction actually changes nothing, thus why it’s implemented in pypitoken
as a
LegacyNoopRestriction
.
Tokens without restrictions work the same as tokens with a noop restriction (or, for what it’s worth, token with multiple noop restrictions).
Note that when the restrictions were re-worked in PyPI in Summer 2022, the “user” caveat was actually associated with a check that the request was originated by the corresponding user. This is mainly relevant around OpenID Connect use-cases.
What does “normalized name” mean?¶
Throughout the doc, the term “normalized name” for a project is regularly used. This is because some characters are synonymous in a project name, so in order to match a project name, we need to put it to canonical form first.
See PEP 503 for all the details.
What would be good practice regarding token restrictions and traceability¶
PyPI offers quite a bit of interesting features regarding token traceability & audit:
You can list your existing tokens, including a description of your choice
You can revoke them
You can see the restrictions applied at generation time by PyPI
Other project admins can see if you generated tokens for projects you share with them
It’s generally considered a good idea to use each token for one dedicated usage, so that if you need to revoke a token, you don’t break anything else.
Adding restrictions yourself on existing tokens have consequences on those elements:
If two tokens are created by adding restrictions to a single “parent” token, revoking the parent token will revoke all the children at the same time. Given it’s still a good idea to use a token for one usage only, when you generate a token with a restriction, if you plan to store the child token, then you should consider throwing away immediately the original token. Of course, this is not always applicable, some use-cases may require to store both but you may need to track the diffusion of your tokens yourself.
In PyPI, restrictions you added yourself will not appear in the token list, so it’s a good idea to be overly explicit in the token description. Note that the description field cannot be modified after generation.
This way, your PyPI account page will still be a good place to track all of your existing tokens, and you will be able to follow each of them easily.
All this talking about Macaroons, I’m hungry now!¶
Here’s a recipe for Montmorillon Macaroons, as done in the author’s region. Note that it’s using SI units.
Ingredients:
Sugar: 125g
Ground almonds: 150g
Egg white: 70g (that’s 2 eggs)
Bitter almond (few drops)
Steps:
Preheat oven to 50°C.
Spread the ground almonds on a baking sheet, put in oven for 10 to 15 minutes.
Remove it from oven, let it cool and mix with sugar.
Whip the egg whites stiff and add a few drops of bitter almond.
Using a rubber spatula, fold the egg whites into the sugar & almond batter.
Pour the batter into a piping bag with a ribbed nozzle.
Form the macaroons on baking paper and leave them to rest for 2h at ambient temperature.
Preheat oven to 190°C.
Lower the oven to 180°C, and put the macaroons in for 3 minutes, then 15 minutes at 160°C.
Allow to cool before yummy time.
Nice logo! Where did you get it?¶
Design is “macaroon” by Izwar Muis from the Noun Project. Colors are taken from Python’s visual identity.