Skip to content

Commit b834bdd

Browse files
committed
initial draft of improved api tokens
1 parent 06893a5 commit b834bdd

File tree

1 file changed

+162
-0
lines changed

1 file changed

+162
-0
lines changed

text/XXXX-improved-api-tokens.md

Lines changed: 162 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,162 @@
1+
- Start Date: 2022-10-27
2+
- RFC Type: feature
3+
- RFC PR: <link>
4+
- RFC Status: draft
5+
6+
---
7+
8+
**Table of Contents**
9+
10+
- [Summary](#summary)
11+
- [Motivation](#motivation)
12+
- [Why?](#why-)
13+
- [Expected Outcome](#expected-outcome)
14+
- [User Experience](#user-experience)
15+
- [Token Format](#token-format)
16+
- [Backend Token Hygiene](#backend-token-hygiene)
17+
- [Secret Verification Service](#secret-verification-service)
18+
- [Background](#background)
19+
- [Options Considered](#options-considered)
20+
- [Option #1](#option--1)
21+
- [`n+1`](#-n-1-)
22+
- [`n+2`](#-n-2-)
23+
- [Option #2](#option--2)
24+
- [Option #3](#option--3)
25+
- [Drawbacks](#drawbacks)
26+
- [Option #2](#option--2-1)
27+
- [Option #3](#option--3-1)
28+
- [Unresolved questions](#unresolved-questions)
29+
30+
---
31+
32+
# Summary
33+
34+
Improve on Sentry's API token implementation to a more secure pattern. We will have three major goals:
35+
36+
1. Using hashed values in the database
37+
2. Only display the token once to the end user upon creation
38+
3. Allow users to _name_ the tokens ([#9600](https://github.com/getsentry/sentry/issues/9600))
39+
4. Use a predictable prefix to integrate with various secret scanning services (ex. Github's Secret Scanning)
40+
41+
# Motivation
42+
43+
## Why?
44+
45+
Sentry currently provides several strong options to secure a user's account, including SSO, SAML, and 2FA options. However our approach to handling API tokens is not as mature. We have two major issues with the current API token implementation:
46+
47+
1. Tokens are stored as plaintext in the database, increasing the risk of exposure
48+
2. Tokens are visible as plaintext in the Sentry UI
49+
50+
As a result, Sentry has to take extra steps to ensure the confidentially of API tokens (for example, [PR #39254](https://github.com/getsentry/sentry/pull/39254)).
51+
52+
## Expected Outcome
53+
54+
### User Experience
55+
56+
When a user creates an API token, they **have the option to provide it an arbitrary name** and **the actual token value is only displayed to them once**.
57+
58+
Existing API tokens, should be hashed and their original value discarded.
59+
60+
A notice in the Sentry UI should be presented to suggest the user rotate and generate a new token to the new and improved version.
61+
62+
### Token Format
63+
64+
We need a predictable token format in order to integrate properly with secret scanning services. Our current format is a 64 character alphanumeric string. This is insufficient and would likely produce a high amount of false positives in tooling like [TruffleHog](https://github.com/trufflesecurity/trufflehog), [detect-secrets](https://github.com/Yelp/detect-secrets), [Github's Secret Scanning](https://docs.github.com/en/code-security/secret-scanning/about-secret-scanning), etc.
65+
66+
The suggested pattern is `snty[a-zA-Z]_[a-zA-Z0-9]{64}`. The character _after_ `snty` will be used to identify the token type.
67+
68+
For example:
69+
70+
- `sntyu_a1b2c3...` - identifies a **user** API token
71+
- `sntya_a1b2c3...` - identifies an **API Application** token
72+
73+
### Backend Token Hygiene
74+
75+
API tokens should be stored in the database with a cryptographically secure hash (minimum SHA-256). Salting is not necessary since these values are sufficient in length.
76+
77+
An additional column identifying the API token version will be added in order to correctly identify legacy API tokens.
78+
79+
### Secret Verification Service
80+
81+
In order to integrate with Github's Secret Scanning service, we will need to have an available service for them to submit potential matches to.
82+
83+
More details about Github's secret scanning program can be found [here](https://docs.github.com/en/developers/overview/secret-scanning-partner-program).
84+
85+
At a high-level, we will need to be able to:
86+
87+
- Receive a list of one or more potential tokens
88+
- [Verify the signature of the request](https://docs.github.com/en/developers/overview/secret-scanning-partner-program#implement-signature-verification-in-your-secret-alert-service)
89+
- Determine if the token is a true positive or a false positive
90+
- send this feedback back to Github
91+
- Disable/revoke the compromised token
92+
- Send a notification to the appropriate user
93+
94+
> :memo: This secret verification service is not an immediate need or a dependency of implementing the improved API tokens.
95+
96+
# Background
97+
98+
Accidental credential leaks [happen](https://github.com/ChALkeR/notes/blob/master/Do-not-underestimate-credentials-leaks.md). Even though we provide several tools for users to limit the storage of sensitive information in Sentry it can still happen.
99+
100+
Our support for various authentication methods (including 2FA, SSO, and SAML) helps mitigate access of potentially sensitive information, but API tokens cannot support these additional authentication gates.
101+
102+
We can help customers further protect their accounts and data by providing a means of auto-detecting leaked API tokens.
103+
104+
# Options Considered
105+
106+
## Option #1
107+
108+
The migration from the legacy tokens to the new tokens should be rolled out over X releases. This allows for a smooth transition of self-hosted instances keeping pace with the versions.
109+
110+
> With `n` being the current version.
111+
112+
### `n+1`
113+
114+
- Add a Django migration including the `hashed_token`, `name`, and `token_version` fields.
115+
- columns should allow `null` values for now.
116+
- Remove displaying the plaintext token in the frontend, and only display newly created tokens once.
117+
- Implement new logic to generate the new token pattern.
118+
- The `token_version` should be set as `2`.
119+
- Implement a backwards compatible means of verifying legacy tokens. This should:
120+
- Verify the legacy token is valid/invalid.
121+
- If valid, hash the legacy token and store it in the table.
122+
- Remove the plaintext representation of the token in the table.
123+
- Mark the `token_version` as `1`.
124+
- A notification/warning in the UI should be displayed recommending users recreate their tokens, resulting in the new token version.
125+
126+
The `n+1` Sentry version should run in production for sufficient time to update the majority of the rows in the database as the legacy tokens are _used_ by users.
127+
128+
### `n+2`
129+
130+
- Add a Django migration to generate the `hashed_token` value for any tokens in `sentry_apitoken` that do not already have a hashed value.
131+
- Rows matching this are legacy tokens that were not used during the transition period.
132+
- The `token_version` should be set to `1` for these rows.
133+
134+
## Option #2
135+
136+
Instead of slowly generating the hashed token values over time of the legacy tokens (as in Option #1), we could generate a single migration that migrates the entire table to the end state.
137+
138+
## Option #3
139+
140+
To avoid the two different token versions, we could automatically prepend the prefix `sntyx_` (with `x` just being a placeholder here). We would then follow a similar approach to Option #1 or Option #2 to generate the hashed values.
141+
142+
# Drawbacks
143+
144+
## Option #2
145+
146+
- This is less than ideal because of the potential performance impact while the migration mutates each row in the table. Potential impacts to self-hosted installs would be unknown as well.
147+
148+
## Option #3
149+
150+
- Users would not be able to benefit from the Github Secrets Scanning since they would still be using their 64 character alphanumeric string without the prefix.
151+
- Authentication views/logic would become more complex with branching code to handle the with and without prefix cases.
152+
153+
# Unresolved questions
154+
155+
- Implementation path for `APIApplication` secrets.
156+
- Is there a downside to allowing `null` in the additional columns?
157+
- What is the median time between token use?
158+
- _This value could be used to inform how long we wait between versions for the migration that will edit pending rows in the database._
159+
- Do we want to support [API Keys](https://docs.sentry.io/api/auth/#api-keys) as well, even though they are considered a legacy means of authentication?
160+
- Are there additional secret types we would like to apply this to?
161+
- Should the secret verification service live within the monolith or run as a separate service entirely?
162+
- _Github [mentions that there could be large batches of secrets sent for verification](https://docs.github.com/en/developers/overview/secret-scanning-partner-program#create-a-secret-alert-service:~:text=Your%20endpoint%20should%20be%20able%20to%20handle%20requests%20with%20a%20large%20number%20of%20matches%20without%20timing%20out.). It might make sense to keep this as a separate service to avoid impact._

0 commit comments

Comments
 (0)