Commit bb528e9: Fix typos (#878)
1 parent 067067b, commit bb528e9
6 files changed: 13 additions & 13 deletions


docs/comparisons.md

Lines changed: 2 additions & 2 deletions
@@ -5,7 +5,7 @@
 There are many tools and frameworks in the data ecosystem. This page tries to make sense of it all.
 
 ## dbt
-[dbt](https://www.getdbt.com/) is a tool for data transformations. It is a pioneer in this space and has shown how valuable transformation frameworks can be. Although dbt is a fanstastic tool, it has trouble scaling with data and organizational size.
+[dbt](https://www.getdbt.com/) is a tool for data transformations. It is a pioneer in this space and has shown how valuable transformation frameworks can be. Although dbt is a fantastic tool, it has trouble scaling with data and organizational size.
 
 dbt built their product focused on simple data transformations. By default, it fully refreshes data warehouses by executing templated SQL in the correct order.

@@ -107,7 +107,7 @@ WHERE d.ds BETWEEN @start_ds AND @end_ds
 #### Data leakage
 dbt does not check whether the data inserted into an incremental table should be there or not. This can lead to problems and consistency issues, such as late-arriving data overriding past partitions. These problems are called "data leakage."
 
-SQLMesh wraps all queries in a subquery with a time filter under the hood to enforce that the data inserted for a particular batch is as expected and reproducible everytime.
+SQLMesh wraps all queries in a subquery with a time filter under the hood to enforce that the data inserted for a particular batch is as expected and reproducible every time.
 
 In addition, dbt only supports the 'insert/overwrite' incremental load pattern for systems that natively support it. SQLMesh enables 'insert/overwrite' on any system, because it is the most robust approach to incremental loading, while 'Append' pipelines risk data inaccuracy in the variety of scenarios where your pipelines may run more than once for a given date.
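As a rough illustration of the time-filter wrapping described in this hunk (table and column names here are hypothetical; `@start_ds` and `@end_ds` are the macro variables shown in the hunk header, and the exact SQL SQLMesh generates may differ):

```sql
-- A model's query as written:
SELECT ds, user_id, revenue
FROM raw.orders;

-- Sketch of what is effectively executed for one batch,
-- so rows outside the batch's time range can never leak in:
SELECT ds, user_id, revenue
FROM (
  SELECT ds, user_id, revenue
  FROM raw.orders
) AS _q
WHERE ds BETWEEN @start_ds AND @end_ds;
```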

docs/concepts/models/overview.md

Lines changed: 6 additions & 6 deletions
@@ -33,7 +33,7 @@ The `SELECT` expression of a model must follow certain conventions for SQLMesh t
 ### Unique column names
 The final `SELECT` of a model's query must contain unique column names.
 
-### Explict types
+### Explicit types
 SQLMesh encourages explicit type casting in the final `SELECT` of a model's query. It is considered a best practice to prevent unexpected types in the schema of a model's table.
 
 SQLMesh uses the postgres `x::int` syntax for casting; the casts are automatically transpiled to the appropriate format for the execution engine.
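For illustration, a final `SELECT` following this convention might cast every column explicitly (a hypothetical model query; the table and columns are made up):

```sql
SELECT
  id::int AS id,
  event_ds::text AS event_ds,
  amount::float AS amount
FROM raw.events
```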
@@ -55,11 +55,11 @@ This example demonstrates non-inferrable, inferrable, and explicit aliases:
 ```sql linenums="1"
 SELECT
   1, -- not inferrable
-  x + 1, -- not infererrable
-  SUM(x), -- not infererrable
+  x + 1, -- not inferrable
+  SUM(x), -- not inferrable
   x, -- inferrable as x
   x::int, -- inferrable as x
-  x + 1 AS x, -- explictly x
+  x + 1 AS x, -- explicitly x
   SUM(x) as x, -- explicitly x
 ```

@@ -87,7 +87,7 @@ Name is ***required*** and must be ***unique***.
 - Start is used to determine the earliest time needed to process the model. It can be an absolute date/time (`2022-01-01`), or a relative one (`1 year ago`).
 
 ### cron
-- Cron is used to schedule your model to process or refresh at a certain interval. It uses [croniter](https://github.com/kiorky/croniter) under the hood, so expressions such as `@daily` can be used. A model's `IntervalUnit` is determined implicity by the cron expression.
+- Cron is used to schedule your model to process or refresh at a certain interval. It uses [croniter](https://github.com/kiorky/croniter) under the hood, so expressions such as `@daily` can be used. A model's `IntervalUnit` is determined implicitly by the cron expression.
 
 ### storage_format
 - Storage format is a property for engines such as Spark or Hive that support storage formats such as `parquet` and `orc`.
@@ -112,7 +112,7 @@ For models that are incremental, the following parameters can be specified in th
 - Batch size is used to optimize backfilling incremental data. It determines the maximum number of intervals to run in a single job. For example, if a model specifies a cron of `@hourly` and a batch_size of `12`, when backfilling 3 days of data, the scheduler will spawn 6 jobs. (3 days * 24 hours/day = 72 hour intervals to fill. 72 intervals / 12 intervals per job = 6 jobs.)
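The batch-size arithmetic in that paragraph can be sketched in a few lines (`backfill_jobs` is a hypothetical helper for illustration, not part of the SQLMesh API):

```python
import math

def backfill_jobs(total_intervals: int, batch_size: int) -> int:
    # Each job runs at most `batch_size` intervals, so the number of
    # jobs is the interval count divided by batch size, rounded up.
    return math.ceil(total_intervals / batch_size)

# 3 days of @hourly intervals (72) with batch_size 12 -> 6 jobs
print(backfill_jobs(3 * 24, 12))  # 6
```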

 ## Macros
-Macros can be used for passing in paramaterized arguments such as dates, as well as for making SQL less repetitive. By default, SQLMesh provides several predefined macro variables that can be used. Macros are used by prefixing with the `@` symbol. For more information, refer to [macros](../macros.md).
+Macros can be used for passing in parameterized arguments such as dates, as well as for making SQL less repetitive. By default, SQLMesh provides several predefined macro variables that can be used. Macros are used by prefixing with the `@` symbol. For more information, refer to [macros](../macros.md).
 
 ## Statements
 Models can have additional statements that run before the main query. This can be useful for loading things such as [UDFs](../glossary.md#user-defined-function-udf).

docs/concepts/models/python_models.md

Lines changed: 1 addition & 1 deletion
@@ -1,6 +1,6 @@
 # Python models
 
-Although SQL is a powerful tool, some use cases are better handled by Python. For example, Pyton may be a better option in pipelines that involve machine learning, interacting with external APIs, or complex business logic that cannot be expressed in SQL.
+Although SQL is a powerful tool, some use cases are better handled by Python. For example, Python may be a better option in pipelines that involve machine learning, interacting with external APIs, or complex business logic that cannot be expressed in SQL.
 
 SQLMesh has first-class support for models defined in Python; there are no restrictions on what can be done in the Python model as long as it returns a Pandas or Spark DataFrame instance.
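A minimal sketch of the kind of return value such a model produces: a plain function that builds a Pandas DataFrame (the function name, columns, and data are made up for illustration; the actual SQLMesh model registration is not shown):

```python
import pandas as pd

def fetch_user_scores() -> pd.DataFrame:
    # Stand-in for logic that is awkward in SQL, such as calling an
    # external API or applying a trained ML model to each row.
    rows = [
        {"user_id": 1, "score": 0.87},
        {"user_id": 2, "score": 0.42},
    ]
    return pd.DataFrame(rows)

df = fetch_user_scores()
print(df.columns.tolist())  # ['user_id', 'score']
```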

docs/concepts/overview.md

Lines changed: 1 addition & 1 deletion
@@ -3,7 +3,7 @@
 This page provides a conceptual overview of what SQLMesh does and how its components fit together.
 
 ## What SQLMesh is
-SQLMesh is a Python framework that automates everything needed to run a scaleable data transformation platform. SQLMesh works with a variety of [engines and orchestrators](../integrations/overview.md).
+SQLMesh is a Python framework that automates everything needed to run a scalable data transformation platform. SQLMesh works with a variety of [engines and orchestrators](../integrations/overview.md).
 
 It was created with a focus on both data and organizational scale and works regardless of your data warehouse or SQL engine's capabilities.

docs/guides/projects.md

Lines changed: 2 additions & 2 deletions
@@ -4,7 +4,7 @@
 
 ---
 
-Before getting started, ensure that you meet the [prerequsities](../prerequisites.md) for using SQLMesh.
+Before getting started, ensure that you meet the [prerequisites](../prerequisites.md) for using SQLMesh.
 
 ---

@@ -58,7 +58,7 @@ To create a project from the command line, follow these steps:
 
 To edit an existing project, open the project file you wish to edit in your preferred editor.
 
-If using CLI or Notebook, you can open a file in your project for editing by using the `sqlmesh` command with the `-p` varaible, and pointing to your project's path as follows:
+If using CLI or Notebook, you can open a file in your project for editing by using the `sqlmesh` command with the `-p` variable, and pointing to your project's path as follows:
 
 ```bash
 sqlmesh -p <your-project-path>

docs/index.md

Lines changed: 1 addition & 1 deletion
@@ -18,7 +18,7 @@ Here are some challenges that data teams run into, especially when data sizes in
 * Validating changes to data pipelines before deploying to production is an uncertain and sometimes expensive process. Although branches can be deployed to environments, when merged to production, the code is re-run. This is wasteful and generates uncertainty because the data is regenerated.
 
 1. Silos transform data lakes to data swamps
-* The difficulty and cost of making changes to core pipelines can lead to duplicate pipelines with minor customizations. The inability to easily make and validate changes causes contributors to follow the "path of least resistence". The proliferation of similar tables leads to additional costs, inconsistencies, and maintenance burden.
+* The difficulty and cost of making changes to core pipelines can lead to duplicate pipelines with minor customizations. The inability to easily make and validate changes causes contributors to follow the "path of least resistance". The proliferation of similar tables leads to additional costs, inconsistencies, and maintenance burden.
 
 ## What is SQLMesh?
 SQLMesh consists of a CLI, a Python API, and a Web UI to make data pipeline development and deployment easy, efficient, and safe.
