Monday, 30 November 2015

MongoDB: How GridFS handles data larger than 16 MB

The maximum document size in MongoDB is 16 MB. To store files larger than this limit, MongoDB provides the GridFS API.

How GridFS handles data larger than 16 MB
Instead of storing a file in a single document, GridFS divides the file into parts called chunks and stores each chunk as a separate document. In the example further down, the driver uses a chunk size of 261120 bytes (255 KB).
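To make the chunking idea concrete, here is a minimal sketch in plain Java that cuts a byte array into fixed-size pieces. It is only an illustration of the principle, not the driver's actual implementation; the class name ChunkSplitter and the method split are hypothetical and do not exist in the MongoDB driver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper for illustration only; not part of the MongoDB driver.
final class ChunkSplitter {

    // Cut data into consecutive pieces of at most chunkSize bytes.
    // Every piece is full except possibly the last one.
    static List<byte[]> split(byte[] data, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<byte[]> chunks = new ArrayList<>();
        for (int start = 0; start < data.length; start += chunkSize) {
            int end = Math.min(start + chunkSize, data.length);
            chunks.add(Arrays.copyOfRange(data, start, end));
        }
        return chunks;
    }

    public static void main(String[] args) {
        // 10 bytes with a chunk size of 4 gives pieces of 4, 4 and 2 bytes.
        List<byte[]> chunks = split(new byte[10], 4);
        System.out.println(chunks.size() + " chunks, last one holds "
                + chunks.get(chunks.size() - 1).length + " bytes");
    }
}
```

In GridFS each of these pieces would become one document, so no single document ever has to hold the whole file.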

GridFS uses two collections to store files: one holds the metadata of each file, and the other holds the chunks. When you query a GridFS store for a file, the driver or client reassembles the chunks as needed.
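The reassembly step can be pictured with a small in-memory model. In a real GridFS store, every chunk document carries a files_id (the file it belongs to), a sequence number n, and the binary data. The sketch below mimics that with a hypothetical Chunk class and a plain list standing in for the chunks collection; it is a simplified illustration under those assumptions, not what the driver does internally.

```java
import java.io.ByteArrayOutputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical in-memory model for illustration only.
final class ReassemblySketch {

    // Stand-in for one document in the chunks collection.
    static final class Chunk {
        final String filesId; // id of the file this chunk belongs to
        final int n;          // position of the chunk within the file
        final byte[] data;    // payload of the chunk

        Chunk(String filesId, int n, byte[] data) {
            this.filesId = filesId;
            this.n = n;
            this.data = data;
        }
    }

    // Select the chunks of one file, order them by n, and concatenate the payloads.
    static byte[] reassemble(String filesId, List<Chunk> chunksCollection) {
        List<Chunk> mine = new ArrayList<>();
        for (Chunk c : chunksCollection) {
            if (c.filesId.equals(filesId)) {
                mine.add(c);
            }
        }
        mine.sort(Comparator.comparingInt((Chunk c) -> c.n));
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        for (Chunk c : mine) {
            out.write(c.data, 0, c.data.length);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        List<Chunk> chunks = new ArrayList<>();
        // Chunks are deliberately stored out of order and mixed with another file.
        chunks.add(new Chunk("f1", 1, "lo ".getBytes(StandardCharsets.UTF_8)));
        chunks.add(new Chunk("f2", 0, "other".getBytes(StandardCharsets.UTF_8)));
        chunks.add(new Chunk("f1", 0, "Hel".getBytes(StandardCharsets.UTF_8)));
        chunks.add(new Chunk("f1", 2, "World".getBytes(StandardCharsets.UTF_8)));
        System.out.println(new String(reassemble("f1", chunks), StandardCharsets.UTF_8));
    }
}
```

The point of the model is that the order in which chunks are stored does not matter; the sequence number alone determines how the file is put back together.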

package com.orient.kalyan.hadoop.training;

import java.io.File;
import com.mongodb.DB;
import com.mongodb.MongoClient;
import com.mongodb.gridfs.GridFS;
import com.mongodb.gridfs.GridFSDBFile;
import com.mongodb.gridfs.GridFSInputFile;

public class GRID_FS_Example {

 /* Step 1 : get a MongoClient connected to the local server */
 public static MongoClient getMongoClient() throws Exception {
  // Let connection errors propagate instead of swallowing them and returning null
  return new MongoClient("localhost", 27017);
 }

 /* Step 2 : store the file in the "videos" bucket (videos.files and videos.chunks) */
 public static void saveFile(DB db, File file) throws Exception {
  GridFS gridfs = new GridFS(db, "videos");
  GridFSInputFile gfsFile = gridfs.createFile(file);
  gfsFile.setFilename("myvideo");
  gfsFile.save();
 }

 /* Step 3 : look the file up by name and print its metadata document */
 public static void getFile(DB db) {
  String fileName = "myvideo";
  GridFS gridfs = new GridFS(db, "videos");
  GridFSDBFile storedFile = gridfs.findOne(fileName);
  System.out.println(storedFile);
 }

 public static void main(String[] args) throws Exception {

  MongoClient mongoClient = getMongoClient();
  try {
   DB db = mongoClient.getDB("test");

   File file = new File("/home/hadoop/work/input/hello.mp4");
   saveFile(db, file);

   getFile(db);

   System.out.println("done");
  } finally {
   // Release the connection even if the upload or lookup fails
   mongoClient.close();
  }
 }
}


Output
{ "filename" : "myvideo" , "aliases" :  null  , "chunkSize" : 261120 , "uploadDate" : { "$date" : "2015-11-24T08:45:25.365Z"} , "length" : 915607173 , "_id" : { "$oid" : "54c35ba5df161e5e1b7a6491"} , "contentType" :  null  , "md5" : "8ce772b47ddbb007a089b103579140dc"}
done
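
The length and chunkSize fields in this output are enough to estimate how many documents videos.chunks should contain for the file. The sketch below does that arithmetic, assuming every chunk except the last is completely full; the class ChunkCount and its methods are hypothetical names used only for this calculation.

```java
// Hypothetical helper for illustration only; not part of the MongoDB driver.
final class ChunkCount {

    // Number of chunks needed for a file: length divided by chunkSize, rounded up.
    static long chunkCount(long length, long chunkSize) {
        return (length + chunkSize - 1) / chunkSize;
    }

    // Size in bytes of the final chunk (0 for an empty file).
    static long lastChunkSize(long length, long chunkSize) {
        long remainder = length % chunkSize;
        return (remainder == 0 && length > 0) ? chunkSize : remainder;
    }

    public static void main(String[] args) {
        long length = 915607173L; // "length" from the output above
        long chunkSize = 261120L; // "chunkSize" from the output above
        System.out.println("chunks: " + chunkCount(length, chunkSize));       // chunks: 3507
        System.out.println("last chunk: " + lastChunkSize(length, chunkSize)); // last chunk: 120453
    }
}
```

So a file of roughly 873 MB ends up as 3507 chunk documents, each far below the 16 MB document limit.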


Running the Java program above creates two collections: videos.chunks, which stores the chunks, and videos.files, which stores the file metadata.

> show collections
kalyan
system.indexes
videos
videos.chunks
videos.files
