Commit 5c0a678

add user dataset
1 parent d13404e commit 5c0a678

6 files changed

+6156
-5
lines changed

docs/003-create-index.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -85,7 +85,7 @@ This is where RediSearch module is helping, and why it as been created.
8585

8686
RediSearch greatly simplifies this by offering a simple and automatic way to create secondary indices on Redis Hashes (more data structures will eventually be supported).
8787

88-
![Secondary Index](https://raw.githubusercontent.com/RediSearch/redisearch-getting-started/blob/master/docs/images/secondary-index.png)
88+
![Secondary Index](https://github.com/RediSearch/redisearch-getting-started/blob/master/docs/images/secondary-index.png?raw=true)
8989

9090
With RediSearch, to query on a field you must first index it. Let's start by indexing the following fields of our movies:
9191

docs/006-import-dataset.md

Lines changed: 120 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,10 @@
11
# Sample Dataset
22

3-
In the previous steps you used only few movies, let's now import more movies and some theaters (to discover the geospatial capabilities)
3+
In the previous steps you used only a few movies. Let's now import:
4+
5+
* More movies *to discover more queries*.
6+
* Theaters *to discover the geospatial capabilities*.
7+
* Users *to do some aggregations*.
48

59
## Dataset Description
610

@@ -154,7 +158,110 @@ The theater hashes contain the following fields.
154158
</table>
155159
</details>
156160

157-
## Importing the Movies & Theaters
161+
162+
**Users**
163+
164+
The file `sample-app/redisearch-docker/dataset/import_users.redis` is a script that creates 5996 hashes.
165+
166+
The user hashes contain the following fields.
167+
168+
* **`user:id`** : The unique ID of the user.
169+
* **`first_name`** : The first name of the user.
170+
* **`last_name`** : The last name of the user.
171+
* **`email`** : The email of the user.
172+
* **`gender`** : The gender of the user (`female`/`male`).
173+
* **`country`** : The country name of the user.
174+
* **`country_code`** : The country code of the user.
175+
* **`city`** : The city of the user.
176+
* **`longitude`** : The longitude of the user.
177+
* **`latitude`** : The latitude of the user.
178+
* **`last_login`** : The last login of the user, as Unix epoch time.
179+
* **`ip_address`** : The IP address of the user.
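
Because `last_login` is stored as an epoch timestamp, client code usually has to convert it before displaying it. The following is a minimal Python sketch of that conversion, assuming the value is in seconds (which matches the sample data for `user:3233`) and arrives as a string, as hash values do. The helper name `last_login_to_datetime` is hypothetical and not part of the tutorial.

```python
from datetime import datetime, timezone


def last_login_to_datetime(last_login: str) -> datetime:
    """Convert a `last_login` hash value (epoch seconds, as a string) to a UTC datetime.

    Assumption: the timestamp is expressed in seconds, not milliseconds.
    """
    return datetime.fromtimestamp(int(last_login), tz=timezone.utc)


# Value taken from the `user:3233` sample data.
print(last_login_to_datetime("1570386621").isoformat())
# → 2019-10-06T18:30:21+00:00
```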
180+
181+
<details>
182+
<summary>Sample Data: <b>user:3233</b></summary>
183+
<table>
184+
<thead>
185+
<tr>
186+
<th>Field</th>
187+
<th>Value</th>
188+
</tr>
189+
</thead>
190+
<tbody>
191+
<tr>
192+
<th>first_name</th>
193+
<td style='font-family:monospace; font-size: 0.875em; "'>
194+
Rosetta
195+
</td>
196+
</tr>
197+
<tr>
198+
<th>last_name</th>
199+
<td style='font-family:monospace; font-size: 0.875em; "'>
200+
Olyff
201+
</td>
202+
</tr>
203+
<tr>
204+
<th>email</th>
205+
<td style='font-family:monospace; font-size: 0.875em; "'>
206+
207+
</td>
208+
</tr>
209+
<tr>
210+
<th>gender</th>
211+
<td style='font-family:monospace; font-size: 0.875em; "'>
212+
female
213+
</td>
214+
</tr>
215+
<tr>
216+
<th>country</th>
217+
<td style='font-family:monospace; font-size: 0.875em; "'>
218+
China
219+
</td>
220+
</tr>
221+
<tr>
222+
<th>country_code</th>
223+
<td style='font-family:monospace; font-size: 0.875em; "'>
224+
CN
225+
</td>
226+
</tr>
227+
<tr>
228+
<th>city</th>
229+
<td style='font-family:monospace; font-size: 0.875em; "'>
230+
Huangdao
231+
</td>
232+
</tr>
233+
<tr>
234+
<th>longitude</th>
235+
<td style='font-family:monospace; font-size: 0.875em; "'>
236+
120.04619
237+
</td>
238+
</tr>
239+
<tr>
240+
<th>latitude</th>
241+
<td style='font-family:monospace; font-size: 0.875em; "'>
242+
35.872664
243+
</td>
244+
</tr>
245+
<tr>
246+
<th>last_login</th>
247+
<td style='font-family:monospace; font-size: 0.875em; "'>
248+
1570386621
249+
</td>
250+
</tr>
251+
<tr>
252+
<th>ip_address</th>
253+
<td style='font-family:monospace; font-size: 0.875em; "'>
254+
218.47.90.79
255+
</td>
256+
</tr>
257+
</tbody>
258+
</table>
259+
</details>
260+
261+
262+
---
263+
264+
## Importing the Movies, Theaters and Users
158265

159266
Before importing the data, flush the database:
160267

@@ -170,6 +277,9 @@ $ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import
170277
171278
$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_theaters.redis
172279
280+
281+
$ redis-cli -h localhost -p 6379 < ./sample-app/redisearch-docker/dataset/import_users.redis
282+
173283
```
174284

175285

@@ -186,11 +296,18 @@ Using Redis Insight or the redis-cli you can look at the dataset:
186296
> HMGET "theater:20" name location
187297
1) "Broadway Theatre"
188298
2) "-73.98335054631019,40.763270202723625"
299+
300+
301+
302+
> HMGET "user:343" first_name last_name last_login
303+
1) "Umeko"
304+
2) "Castagno"
305+
3) "1574769122"
306+
189307
```
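
The theater `location` value above is a single string in `longitude,latitude` order. As a rough illustration of how an application might consume it, here is a small Python sketch that splits the string into two floats. The function name `parse_location` is an assumption made for this example and does not come from the tutorial or from any Redis client library.

```python
from typing import Tuple


def parse_location(location: str) -> Tuple[float, float]:
    """Split a "longitude,latitude" string into a (longitude, latitude) tuple.

    Assumption: the value always contains exactly one comma, as in the
    `theater:20` example.
    """
    longitude, latitude = location.split(",")
    return float(longitude), float(latitude)


# Value taken from the HMGET "theater:20" output.
lon, lat = parse_location("-73.98335054631019,40.763270202723625")
print(lon, lat)
```

Note the order: longitude comes first, which is the opposite of the "lat, long" convention used by many mapping tools.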
190308

191309
You can also use the `DBSIZE` command to see how many keys you have in your database.
192310

193-
The script create 922 movies and 117 theathers, and not the RediSearch index, this will be done in the next step.
194311

195312
---
196313
Next: [Querying the movie database](007-query-movies.md)

docs/007-query-movies.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@
66
**Create the `idx:movie` index:**
77

88
```
9-
> FT.CREATE idx:movie ON hash PREFIX 1 "movie:" SCHEMA title TEXT SORTABLE plot TEXT WEIGHT 0.5 release_year NUMERIC SORTABLE rating NUMERIC SORTABLE genre TAG SORTABLE
9+
> FT.CREATE idx:movie ON hash PREFIX 1 "movie:" SCHEMA title TEXT SORTABLE plot TEXT WEIGHT 0.5 release_year NUMERIC SORTABLE rating NUMERIC SORTABLE votes NUMERIC SORTABLE genre TAG SORTABLE
1010
1111
"OK"
1212
```

docs/008-aggregation.md

Lines changed: 35 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -62,6 +62,41 @@ Let's now take some examples.
6262
---
6363
</details>
6464

65+
<details>
66+
<summary>
67+
<i><b>
68+
Number of movies by genre, with the total number of votes, and average rating
69+
</b></i>
70+
</summary>
71+
72+
```
73+
> FT.AGGREGATE idx:movie "*" GROUPBY 1 @genre REDUCE COUNT 0 AS nb_of_movies REDUCE SUM 1 votes AS nb_of_votes REDUCE AVG 1 rating AS avg_rating SORTBY 4 @avg_rating DESC @nb_of_votes DESC
74+
75+
76+
1) (integer) 26
77+
2) 1) "genre"
78+
2) "fantasy"
79+
3) "nb_of_movies"
80+
4) "1"
81+
5) "nb_of_votes"
82+
6) "1500090"
83+
7) "avg_rating"
84+
8) "8.8"
85+
...
86+
11) 1) "genre"
87+
2) "romance"
88+
3) "nb_of_movies"
89+
4) "2"
90+
5) "nb_of_votes"
91+
6) "746"
92+
7) "avg_rating"
93+
8) "6.65"
94+
```
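
The reply shown above is a flat structure: a leading integer followed by one list per group, each alternating field names and values. The Python sketch below shows one possible way to turn such a reply into dictionaries. It works on a hand-written list that mirrors the two rows visible in the output, so it needs no Redis connection. The helper name `parse_aggregate_reply` is hypothetical, and a real client library may already return a different shape.

```python
from typing import Dict, List, Tuple


def parse_aggregate_reply(reply: list) -> Tuple[int, List[Dict[str, str]]]:
    """Turn [n, [k1, v1, k2, v2, ...], ...] into (n, [{k1: v1, k2: v2, ...}, ...])."""
    first, *rows = reply
    # Even positions hold field names, odd positions hold the matching values.
    return first, [dict(zip(row[0::2], row[1::2])) for row in rows]


# Hand-copied from the two rows shown in the output above (other rows elided).
sample_reply = [
    26,
    ["genre", "fantasy", "nb_of_movies", "1",
     "nb_of_votes", "1500090", "avg_rating", "8.8"],
    ["genre", "romance", "nb_of_movies", "2",
     "nb_of_votes", "746", "avg_rating", "6.65"],
]

first, groups = parse_aggregate_reply(sample_reply)
for group in groups:
    print(group["genre"], group["nb_of_movies"], group["avg_rating"])
```

All values stay as strings, as they appear in the reply; convert them with `int()` or `float()` where numeric comparisons are needed.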
95+
96+
97+
98+
---
99+
</details>
65100

66101
----
67102
Next: [Advanced Options](009-advanced-features.md)
