Commit 7283dc1

Asynchronous GridFS Implementation
Uses a custom AsyncInputStream and AsyncOutputStream for easy adaptability to custom async byte I/O. JAVA-1282
1 parent c673814 commit 7283dc1

41 files changed: +6822 -173 lines changed

config/checkstyle-exclude.xml

Lines changed: 1 addition & 0 deletions
@@ -37,6 +37,7 @@

 <!--Do not check The GridFSTour -->
 <suppress checks="Regexp" files="GridFSTour"/>
+<suppress checks="MethodLength" files="GridFSTour"/>

 <!--DBCollection is insanely long, and we should not compromise every class length for this one-->
 <suppress checks="FileLength" files="DBCollection"/>

Lines changed: 314 additions & 0 deletions
@@ -0,0 +1,314 @@
+++
date = "2015-11-27T12:00:00-00:00"
title = "GridFS"
[menu.main]
parent = "Async Reference"
identifier = "Async GridFS"
weight = 80
pre = "<i class='fa'></i>"
+++

## GridFS

GridFS is a specification for storing and retrieving files that exceed the BSON document size limit of 16MB.

Instead of storing a file in a single document, GridFS divides the file into parts, or chunks, and stores each chunk as a separate document. By default, GridFS uses a chunk size of 255k. GridFS uses two collections to store files: the chunks collection stores the file chunks, and the files collection stores the file metadata.

When you query a GridFS store for a file, the driver or client reassembles the chunks as needed. GridFS is useful not only for storing files that exceed 16MB but also for storing any file that you want to access without loading the entire file into memory.

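To make the chunking concrete, the following sketch works out how many chunk documents a file of a given length would occupy. It assumes that the "255k" default means 255 × 1024 bytes; `GridFsChunkMath` and its methods are hypothetical helpers for illustration, not part of the driver.

```java
// Hypothetical helper (not part of the driver): chunk arithmetic for GridFS.
final class GridFsChunkMath {
    // Assumption: the "255k" default chunk size means 255 * 1024 bytes.
    static final int DEFAULT_CHUNK_SIZE_BYTES = 255 * 1024;

    // Number of chunk documents needed for a file of the given length.
    static long chunkCount(final long fileLength, final int chunkSizeBytes) {
        return (fileLength + chunkSizeBytes - 1) / chunkSizeBytes;
    }

    // Size in bytes of the final chunk, which may be shorter than the others.
    static long lastChunkSize(final long fileLength, final int chunkSizeBytes) {
        if (fileLength == 0) {
            return 0;
        }
        long remainder = fileLength % chunkSizeBytes;
        return remainder == 0 ? chunkSizeBytes : remainder;
    }

    public static void main(final String[] args) {
        long sixteenMiB = 16L * 1024 * 1024;
        System.out.println(chunkCount(sixteenMiB, DEFAULT_CHUNK_SIZE_BYTES));    // prints 65
        System.out.println(lastChunkSize(sixteenMiB, DEFAULT_CHUNK_SIZE_BYTES)); // prints 65536
    }
}
```

Under these assumptions a 16 MiB file occupies 64 full chunks plus one shorter final chunk.
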
{{% note %}}
For more information about GridFS see the [MongoDB GridFS documentation](http://docs.mongodb.org/manual/core/gridfs/).
{{% /note %}}

The following code snippets come from the `GridFSTour.java` example code
that can be found with the [driver source]({{< srcref "driver-async/src/examples/gridfs/GridFSTour.java">}}).

{{% note class="important" %}}
Always check for errors in any `SingleResultCallback<T>` implementation and handle them appropriately.
The error checks are left out of the examples below only for the sake of brevity.
{{% /note %}}

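As a minimal sketch of that advice, the callback below checks the `Throwable` before touching the result. To stay runnable without the driver, it declares a local `SingleResultCallback<T>` stand-in that is assumed to mirror the shape of the driver's interface; the `String` file id and the `messages` list are purely illustrative.

```java
import java.util.ArrayList;
import java.util.List;

final class ErrorCheckingCallbackSketch {
    // Local stand-in, assumed to mirror the driver's SingleResultCallback<T>.
    interface SingleResultCallback<T> {
        void onResult(T result, Throwable t);
    }

    // Collects outcome messages so the behaviour is easy to inspect.
    static final List<String> messages = new ArrayList<>();

    // Builds a callback that handles the error case before using the result.
    static SingleResultCallback<String> uploadCallback() {
        return new SingleResultCallback<String>() {
            @Override
            public void onResult(final String fileId, final Throwable t) {
                if (t != null) {
                    // On failure the result must not be used.
                    messages.add("Upload failed: " + t.getMessage());
                    return;
                }
                messages.add("Uploaded file id: " + fileId);
            }
        };
    }

    public static void main(final String[] args) {
        uploadCallback().onResult("abc123", null);
        uploadCallback().onResult(null, new IllegalStateException("connection lost"));
        for (String message : messages) {
            System.out.println(message);
        }
    }
}
```

The same pattern applies to nested callbacks, such as the one passed when closing a stream: check the error first, then act on the result.
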
## Async Streams

Because there are multiple APIs for asynchronous I/O on the JVM, the GridFS library uses flexible interfaces for asynchronous input and output.
The [`AsyncInputStream`]({{< apiref "com/mongodb/async/client/gridfs/AsyncInputStream" >}}) interface represents an `InputStream`
and the [`AsyncOutputStream`]({{< apiref "com/mongodb/async/client/gridfs/AsyncOutputStream" >}}) interface represents an `OutputStream`.

In addition to these interfaces there are the following helpers:

* [`AsyncStreamHelper`]({{< apiref "com/mongodb/async/client/gridfs/helpers/AsyncStreamHelper" >}}) which provides support for:
    * `byte[]`
    * `ByteBuffer`
    * `InputStream` - note: input streams are blocking
    * `OutputStream` - note: output streams are blocking

* [`AsynchronousChannelHelper`]({{< apiref "com/mongodb/async/client/gridfs/helpers/AsynchronousChannelHelper" >}}) which provides support for:
    * `AsynchronousByteChannel`
    * `AsynchronousFileChannel`

These interfaces should be easy to wrap for any alternative asynchronous I/O implementation, such as Netty or Vert.x.

## Connecting to GridFS

Interactions with GridFS are done via the [`GridFSBucket`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket" >}}) class.
To create a `GridFSBucket` use the [`GridFSBuckets`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBuckets" >}}) factory class.

Creating a `GridFSBucket` requires an instance of a
[`MongoDatabase`]({{< apiref "com/mongodb/async/client/MongoDatabase" >}}) and you can optionally provide a custom bucket name.

The following example shows how to create a `GridFSBucket`:

```java
// Create a gridFSBucket using the default bucket name "fs"
GridFSBucket gridFSBucket = GridFSBuckets.create(myDatabase);

// Create a gridFSBucket with a custom bucket name "files"
GridFSBucket gridFSBucket = GridFSBuckets.create(myDatabase, "files");
```

## Uploading to GridFS

There are two main ways to upload data into GridFS.

### UploadFromStream

The [`uploadFromStream`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#openUploadStream-java.lang.String-com.mongodb.client.gridfs.model.GridFSUploadOptions-" >}}) method
reads the contents of an [`AsyncInputStream`]({{< apiref "com/mongodb/async/client/gridfs/AsyncInputStream" >}}) and saves it to the `GridFSBucket`.
The size of the chunks defaults to 255k, but can be configured via the [`GridFSUploadOptions`]({{< apiref "com/mongodb/async/client/gridfs/model/GridFSUploadOptions" >}}).

The following example uploads an `AsyncInputStream` into the `GridFSBucket`:

```java
// Get the input stream
Path inputPath = Paths.get("/tmp/mongodb-tutorial.pdf");
AsynchronousFileChannel channelToUploadFrom = AsynchronousFileChannel.open(inputPath, StandardOpenOption.READ);
final AsyncInputStream streamToUploadFrom = channelToInputStream(channelToUploadFrom); // Using the AsynchronousChannelHelper

// Create some custom options
GridFSUploadOptions options = new GridFSUploadOptions()
        .chunkSizeBytes(1024 * 1024)
        .metadata(new Document("type", "presentation"));

gridFSBucket.uploadFromStream("mongodb-tutorial", streamToUploadFrom, options,
        new SingleResultCallback<ObjectId>() {
            @Override
            public void onResult(final ObjectId result, final Throwable t) {
                System.out.println("The fileId of the uploaded file is: " + result.toHexString());
                streamToUploadFrom.close(new SingleResultCallback<Void>() {
                    @Override
                    public void onResult(final Void result, final Throwable t) {
                        // Stream closed
                    }
                });
            }
        }
);
```

### OpenUploadStream

The [`openUploadStream`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#openUploadStream-java.lang.String-com.mongodb.client.gridfs.model.GridFSUploadOptions-">}}) method returns a [`GridFSUploadStream`]({{< apiref "mongodb/client/gridfs/GridFSUploadStream.html">}}) which extends [`AsyncOutputStream`]({{< apiref "com/mongodb/async/client/gridfs/AsyncOutputStream" >}}) and can be written to.

The `GridFSUploadStream` buffers data until it reaches the `chunkSizeBytes` and then inserts the chunk into the chunks collection.
When the `GridFSUploadStream` is closed, the final chunk is written and the file metadata is inserted into the files collection.

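The buffering behaviour described above can be illustrated without the driver. The sketch below is not the `GridFSUploadStream` implementation: `ChunkBufferSketch` is a hypothetical class that collects written bytes, emits a chunk whenever `chunkSizeBytes` is reached, and flushes the final partial chunk on `close()`. The list of emitted chunks stands in for inserts into the chunks collection.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of chunk buffering; not the driver's code.
final class ChunkBufferSketch {
    private final byte[] buffer;
    private int position = 0;
    // Stands in for documents inserted into the chunks collection.
    private final List<byte[]> chunks = new ArrayList<>();

    ChunkBufferSketch(final int chunkSizeBytes) {
        this.buffer = new byte[chunkSizeBytes];
    }

    // Buffers the data and emits a chunk each time the buffer fills up.
    void write(final byte[] data) {
        int offset = 0;
        while (offset < data.length) {
            int toCopy = Math.min(buffer.length - position, data.length - offset);
            System.arraycopy(data, offset, buffer, position, toCopy);
            position += toCopy;
            offset += toCopy;
            if (position == buffer.length) {
                emitChunk();
            }
        }
    }

    // Flushes whatever is left as the final, possibly shorter, chunk.
    void close() {
        if (position > 0) {
            emitChunk();
        }
    }

    private void emitChunk() {
        chunks.add(Arrays.copyOf(buffer, position));
        position = 0;
    }

    List<byte[]> getChunks() {
        return chunks;
    }

    public static void main(final String[] args) {
        ChunkBufferSketch sketch = new ChunkBufferSketch(4);
        sketch.write(new byte[10]);
        sketch.close();
        System.out.println(sketch.getChunks().size()); // prints 3
    }
}
```

With a chunk size of 4, writing 10 bytes emits two full chunks during `write` and one 2-byte chunk on `close`.
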
The following example writes data into the `GridFSBucket` via the returned `GridFSUploadStream`:

```java
ByteBuffer data = ByteBuffer.wrap("Data to upload into GridFS".getBytes(StandardCharsets.UTF_8));
final GridFSUploadStream uploadStream = gridFSBucket.openUploadStream("sampleData");
uploadStream.write(data, new SingleResultCallback<Integer>() {
    @Override
    public void onResult(final Integer result, final Throwable t) {
        System.out.println("The fileId of the uploaded file is: " + uploadStream.getFileId().toHexString());

        uploadStream.close(new SingleResultCallback<Void>() {
            @Override
            public void onResult(final Void result, final Throwable t) {
                // Stream closed
            }
        });
    }
});
```

{{% note %}}
GridFS will automatically create indexes on the files and chunks collections on first upload of data into the GridFS bucket.
{{% /note %}}

## Finding files stored in GridFS

To find the files stored in the `GridFSBucket` use the [`find`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#find--">}}) method.

The following example prints out the filename of each file stored:

```java
gridFSBucket.find().forEach(
        new Block<GridFSFile>() {
            @Override
            public void apply(final GridFSFile gridFSFile) {
                System.out.println(gridFSFile.getFilename());
            }
        },
        new SingleResultCallback<Void>() {
            @Override
            public void onResult(final Void result, final Throwable t) {
                // Finished
            }
        }
);
```

You can also provide a custom filter to limit the results returned. The following example prints out the filenames of all files that have "image/png" set as the `contentType` in the user-defined metadata document:

```java
gridFSBucket.find(eq("metadata.contentType", "image/png")).forEach(
        new Block<GridFSFile>() {
            @Override
            public void apply(final GridFSFile gridFSFile) {
                System.out.println(gridFSFile.getFilename());
            }
        },
        new SingleResultCallback<Void>() {
            @Override
            public void onResult(final Void result, final Throwable t) {
                // Finished
            }
        }
);
```

## Downloading from GridFS

There are four main ways to download data from GridFS.

### DownloadToStream

The [`downloadToStream`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#downloadToStream-org.bson.types.ObjectId-java.io.OutputStream-" >}}) method reads the contents from MongoDB and writes the data directly to the provided [`AsyncOutputStream`]({{< apiref "com/mongodb/async/client/gridfs/AsyncOutputStream" >}}).

The following example downloads a file into the provided `AsyncOutputStream`:

```java
Path outputPath = Paths.get("/tmp/mongodb-tutorial.pdf");
// DELETE_ON_CLOSE removes the downloaded file again once the channel is closed
final AsynchronousFileChannel streamToDownloadTo = AsynchronousFileChannel.open(outputPath, StandardOpenOption.CREATE_NEW,
        StandardOpenOption.WRITE, StandardOpenOption.DELETE_ON_CLOSE);
gridFSBucket.downloadToStream(fileId, channelToOutputStream(streamToDownloadTo), new SingleResultCallback<Long>() {
    @Override
    public void onResult(final Long result, final Throwable t) {
        System.out.println("downloaded file sized: " + result);
        try {
            // close() throws a checked IOException that onResult cannot propagate
            streamToDownloadTo.close();
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
});
```

### DownloadToStreamByName

If you don't know the [`ObjectId`]({{< apiref "org/bson/types/ObjectId.html">}}) of the file you want to download, use the [`downloadToStreamByName`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#downloadToStreamByName-java.lang.String-java.io.OutputStream-com.mongodb.client.gridfs.model.GridFSDownloadByNameOptions-" >}}) method. By default it will download the latest version of the file. Use the [`GridFSDownloadByNameOptions`]({{< apiref "com/mongodb/async/client/gridfs/model/GridFSDownloadByNameOptions.html" >}}) to configure which version to download.

The following example downloads the original version of the file named "mongodb-tutorial" into the `AsyncOutputStream`:

```java
final AsynchronousFileChannel streamToDownloadTo = AsynchronousFileChannel.open(outputPath, StandardOpenOption.CREATE_NEW, StandardOpenOption.WRITE,
        StandardOpenOption.DELETE_ON_CLOSE);
GridFSDownloadByNameOptions downloadOptions = new GridFSDownloadByNameOptions().revision(0);
gridFSBucket.downloadToStreamByName("mongodb-tutorial", channelToOutputStream(streamToDownloadTo), downloadOptions,
        new SingleResultCallback<Long>() {
            @Override
            public void onResult(final Long result, final Throwable t) {
                System.out.println("downloaded file sized: " + result);
                try {
                    // close() throws a checked IOException that onResult cannot propagate
                    streamToDownloadTo.close();
                } catch (IOException e) {
                    e.printStackTrace();
                }
            }
        }
);
```

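Revision numbers can be pictured as indexes into the list of files that share a filename, ordered by upload date. The sketch below assumes the numbering used by the GridFS specification, where `0` is the original file, `1` the first revision, `-1` the latest and `-2` the one before it; the text above only states the `0` case and the latest-by-default behaviour. `RevisionSketch.select` is a hypothetical helper that works on plain strings, not the driver's lookup code.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration of revision numbering; not the driver's code.
final class RevisionSketch {
    // Picks a revision from files that share a filename, ordered oldest first.
    // Assumption: non-negative values count from the original file,
    // negative values count back from the most recent upload.
    static String select(final List<String> oldestFirst, final int revision) {
        int index = revision >= 0 ? revision : oldestFirst.size() + revision;
        if (index < 0 || index >= oldestFirst.size()) {
            throw new IllegalArgumentException("No such revision: " + revision);
        }
        return oldestFirst.get(index);
    }

    public static void main(final String[] args) {
        List<String> uploads = Arrays.asList("v0-original", "v1", "v2-latest");
        System.out.println(select(uploads, 0));  // prints v0-original
        System.out.println(select(uploads, -1)); // prints v2-latest
    }
}
```
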
### OpenDownloadStream

The [`openDownloadStream`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#openDownloadStream-org.bson.types.ObjectId-">}}) method returns a [`GridFSDownloadStream`]({{< apiref "mongodb/client/gridfs/GridFSDownloadStream.html">}}) which extends [`AsyncInputStream`]({{< apiref "com/mongodb/async/client/gridfs/AsyncInputStream" >}}) and can be read from.

The following example reads from the `GridFSBucket` via the returned `AsyncInputStream`:

```java
final ByteBuffer dstByteBuffer = ByteBuffer.allocate(1024 * 1024);
final GridFSDownloadStream downloadStream = gridFSBucket.openDownloadStream(fileId);
downloadStream.read(dstByteBuffer, new SingleResultCallback<Integer>() {
    @Override
    public void onResult(final Integer result, final Throwable t) {
        dstByteBuffer.flip();
        byte[] bytes = new byte[result];
        dstByteBuffer.get(bytes);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));

        downloadStream.close(new SingleResultCallback<Void>() {
            @Override
            public void onResult(final Void result, final Throwable t) {
                // Finished
            }
        });
    }
});
```

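The example above issues a single `read`, which is enough for small files. For larger files, reads are typically repeated until the stream reports that it is exhausted. The sketch below shows that callback-driven loop using local stand-ins: `AsyncInputStream` and `SingleResultCallback` are declared here and only assumed to mirror the driver's interfaces (including the assumption that `read` reports `-1` at the end of the stream), and `ByteArrayAsyncInputStream` is a hypothetical in-memory source.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

final class AsyncReadLoopSketch {
    // Local stand-ins, assumed to mirror the shape of the driver's interfaces.
    interface SingleResultCallback<T> {
        void onResult(T result, Throwable t);
    }

    interface AsyncInputStream {
        void read(ByteBuffer dst, SingleResultCallback<Integer> callback);
    }

    // Hypothetical in-memory source; completes each read immediately.
    static final class ByteArrayAsyncInputStream implements AsyncInputStream {
        private final ByteBuffer source;

        ByteArrayAsyncInputStream(final byte[] bytes) {
            this.source = ByteBuffer.wrap(bytes);
        }

        @Override
        public void read(final ByteBuffer dst, final SingleResultCallback<Integer> callback) {
            if (!source.hasRemaining()) {
                callback.onResult(-1, null); // assumed end-of-stream signal
                return;
            }
            byte[] piece = new byte[Math.min(dst.remaining(), source.remaining())];
            source.get(piece);
            dst.put(piece);
            callback.onResult(piece.length, null);
        }
    }

    // Reads until end of stream, collecting the bytes into the sink.
    static void readFully(final AsyncInputStream in, final ByteBuffer buffer,
                          final ByteArrayOutputStream sink, final SingleResultCallback<byte[]> done) {
        in.read(buffer, new SingleResultCallback<Integer>() {
            @Override
            public void onResult(final Integer result, final Throwable t) {
                if (t != null) {
                    done.onResult(null, t);
                    return;
                }
                if (result < 0) {
                    done.onResult(sink.toByteArray(), null);
                    return;
                }
                buffer.flip();
                byte[] piece = new byte[buffer.remaining()];
                buffer.get(piece);
                sink.write(piece, 0, piece.length);
                buffer.clear();
                // Issue the next read from inside the callback.
                readFully(in, buffer, sink, done);
            }
        });
    }

    public static void main(final String[] args) {
        byte[] data = "Data to upload into GridFS".getBytes(StandardCharsets.UTF_8);
        readFully(new ByteArrayAsyncInputStream(data), ByteBuffer.allocate(8), new ByteArrayOutputStream(),
                new SingleResultCallback<byte[]>() {
                    @Override
                    public void onResult(final byte[] result, final Throwable t) {
                        System.out.println(new String(result, StandardCharsets.UTF_8));
                    }
                });
    }
}
```

Because the in-memory source completes synchronously, the nested calls here grow the call stack with each read; a source that completes on another thread would not behave that way.
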
### OpenDownloadStreamByName

You can also open a `GridFSDownloadStream` by searching against the filename, using the [`openDownloadStreamByName`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#openDownloadStreamByName-java.lang.String-com.mongodb.client.gridfs.model.GridFSDownloadByNameOptions-" >}}) method. By default it will download the latest version of the file. Use the [`GridFSDownloadByNameOptions`]({{< apiref "com/mongodb/async/client/gridfs/model/GridFSDownloadByNameOptions.html" >}}) to configure which version to download.

The following example downloads the latest version of the file named "sampleData" into the `dstByteBuffer` ByteBuffer:

```java
final GridFSDownloadStream downloadStreamByName = gridFSBucket.openDownloadStreamByName("sampleData");
final ByteBuffer dstByteBuffer = ByteBuffer.allocate(1024 * 1024);
downloadStreamByName.read(dstByteBuffer, new SingleResultCallback<Integer>() {
    @Override
    public void onResult(final Integer result, final Throwable t) {
        dstByteBuffer.flip();
        byte[] bytes = new byte[result];
        dstByteBuffer.get(bytes);
        System.out.println(new String(bytes, StandardCharsets.UTF_8));

        downloadStreamByName.close(new SingleResultCallback<Void>() {
            @Override
            public void onResult(final Void result, final Throwable t) {
                // Finished
            }
        });
    }
});
```

## Renaming files

To rename a file, use the [`rename`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#rename-org.bson.types.ObjectId-java.lang.String-">}}) method.

The following example renames a file to "mongodbTutorial":

```java
gridFSBucket.rename(fileId, "mongodbTutorial", new SingleResultCallback<Void>() {
    @Override
    public void onResult(final Void result, final Throwable t) {
        System.out.println("Renamed file");
    }
});
```

{{% note %}}
The `rename` method requires an `ObjectId` rather than a `filename` to ensure the correct file is renamed.

To rename multiple revisions of the same filename, first retrieve the full list of files. Then, for every file that should be renamed, execute `rename` with the corresponding `_id`.
{{% /note %}}

## Deleting files

To delete a file from the `GridFSBucket` use the [`delete`]({{< apiref "com/mongodb/async/client/gridfs/GridFSBucket.html#delete-org.bson.types.ObjectId-">}}) method.

The following example deletes a file from the `GridFSBucket`:

```java
gridFSBucket.delete(fileId, new SingleResultCallback<Void>() {
    @Override
    public void onResult(final Void result, final Throwable t) {
        System.out.println("Deleted file");
    }
});
```
