Provided by: libmongoc-doc_1.3.1-1_all

NAME

       Aggregation_Framework_Examples - Aggregation Framework examples for the MongoDB C driver

REQUIREMENTS

       MongoDB, version 2.2.0 or later. MongoDB C driver, version 0.96.0 or later.

       Let's check if everything is installed.

       Use the following command to load the zips.json data set into a mongod instance:

       $

       Let's use the MongoDB shell to verify that everything was imported successfully.

       $
       MongoDB shell version: 2.6.1
       connecting to: test
       > 29467 >
       {
            "_id" : "35004",
            "city" : "ACMAR",
            "loc" : [
                 -86.51557,
                 33.584132
            ],
            "pop" : 6055,
            "state" : "AL"
       }

AGGREGATIONS USING THE ZIP CODES DATA SET

       Each document in this collection has the following form:

       {
         "_id" : "35004",
         "city" : "ACMAR",
         "state" : "AL",
         "pop" : 6055,
         "loc" : [-86.51557, 33.584132]
       }

       In these documents:

       •  The _id field holds the zip code as a string.

       •  The city field holds the city name.

       •  The state field holds the two-letter state abbreviation.

       •  The pop field holds the population.

       •  The loc field holds the location as a [longitude, latitude] array.

STATES WITH POPULATIONS OVER 10 MILLION

       To get all states with a population of 10 million or more, use the following aggregation
       pipeline:

       #include <mongoc.h>
       #include <stdio.h>

       static void
       print_pipeline (mongoc_collection_t *collection)
       {
          mongoc_cursor_t *cursor;
          bson_error_t error;
          const bson_t *doc;
          bson_t *pipeline;
          char *str;

          /* Stage 1 ($group): sum pop per state. Stage 2 ($match): keep totals >= 10 million. */
          pipeline = BCON_NEW ("pipeline", "[",
             "{", "$group", "{", "_id", "$state", "total_pop", "{", "$sum", "$pop", "}", "}", "}",
             "{", "$match", "{", "total_pop", "{", "$gte", BCON_INT32 (10000000), "}", "}", "}",
          "]");

          cursor = mongoc_collection_aggregate (collection, MONGOC_QUERY_NONE, pipeline, NULL, NULL);

          while (mongoc_cursor_next (cursor, &doc)) {
             str = bson_as_json (doc, NULL);
             printf ("%s\n", str);
             bson_free (str);
          }

          if (mongoc_cursor_error (cursor, &error)) {
             fprintf (stderr, "Cursor Failure: %s\n", error.message);
          }

          mongoc_cursor_destroy (cursor);
          bson_destroy (pipeline);
       }

       int
       main (int argc,
             char *argv[])
       {
          mongoc_client_t *client;
          mongoc_collection_t *collection;

          mongoc_init ();

          client = mongoc_client_new ("mongodb://localhost:27017");
          collection = mongoc_client_get_collection (client, "test", "zipcodes");

          print_pipeline (collection);

          mongoc_collection_destroy (collection);
          mongoc_client_destroy (client);

          mongoc_cleanup ();

          return 0;
       }

       You should see a result like the following:

       { "_id" : "PA", "total_pop" : 11881643 }
       { "_id" : "OH", "total_pop" : 10847115 }
       { "_id" : "NY", "total_pop" : 17990455 }
       { "_id" : "FL", "total_pop" : 12937284 }
       { "_id" : "TX", "total_pop" : 16986510 }
       { "_id" : "IL", "total_pop" : 11430472 }
       { "_id" : "CA", "total_pop" : 29760021 }

       The above aggregation pipeline is built from two pipeline operators: $group and $match.

       The $group pipeline operator requires an _id field, which specifies the grouping key. The
       remaining fields specify how to generate a composite value, and each must use one of the
       group aggregation functions: $addToSet, $first, $last, $max, $min, $avg, $push, or $sum.
       The $match pipeline operator uses the same syntax as the read operation query syntax.

       The $group stage reads all documents and creates a separate document for each state, for
       example:

       { "_id" : "WA", "total_pop" : 4866692 }

       The total_pop field uses the $sum aggregation function to sum the values of all pop fields
       in the source documents.

       Documents created by $group are piped to the $match pipeline operator, which returns the
       documents whose total_pop value is greater than or equal to 10 million.
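
       To make the two stages concrete, here is a minimal plain C sketch that mimics what $group
       and $match compute, using a small in-memory array instead of a MongoDB collection. It
       does not use the C driver and is not how the server implements aggregation. The zip_row_t
       and state_total_t types, the group_and_match function, and the 64-group limit are all
       assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_GROUPS 64 /* arbitrary cap for this sketch */

/* Hypothetical in-memory stand-in for one zip code document. */
typedef struct {
   const char *state; /* two-letter state code */
   long pop;
} zip_row_t;

/* One output row, like the documents emitted by $group. */
typedef struct {
   char state[3];
   long total_pop;
} state_total_t;

/*
 * Mimics {$group: {_id: "$state", total_pop: {$sum: "$pop"}}} followed by
 * {$match: {total_pop: {$gte: min_pop}}}.
 * Writes at most max_out rows to out and returns the number written.
 */
static size_t
group_and_match (const zip_row_t *rows, size_t n_rows, long min_pop,
                 state_total_t *out, size_t max_out)
{
   state_total_t groups[MAX_GROUPS];
   size_t n_groups = 0;
   size_t n_out = 0;
   size_t i, g;

   assert (rows != NULL || n_rows == 0);
   assert (out != NULL || max_out == 0);

   /* $group: find (or create) the group for each row's state, then add pop. */
   for (i = 0; i < n_rows; i++) {
      for (g = 0; g < n_groups; g++) {
         if (strcmp (groups[g].state, rows[i].state) == 0) {
            break;
         }
      }
      if (g == n_groups) {
         if (n_groups == MAX_GROUPS) {
            continue; /* sketch only: rows beyond the cap are ignored */
         }
         strncpy (groups[g].state, rows[i].state, 2);
         groups[g].state[2] = '\0';
         groups[g].total_pop = 0;
         n_groups++;
      }
      groups[g].total_pop += rows[i].pop;
   }

   /* $match: keep only groups whose total is >= min_pop. */
   for (g = 0; g < n_groups && n_out < max_out; g++) {
      if (groups[g].total_pop >= min_pop) {
         out[n_out++] = groups[g];
      }
   }

   return n_out;
}
```

       For example, with made-up rows for the hypothetical states "AA" (6,000,000 and
       5,000,000), "BB" (4,000,000 and 3,000,000), and "CC" (10,000,000), calling
       group_and_match with a min_pop of 10000000 keeps AA and CC. CC is kept because $gte
       includes a total of exactly 10 million.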

AVERAGE CITY POPULATION BY STATE

       To get the three states with the greatest average population per city, use the following
       aggregation:

       pipeline = BCON_NEW ("pipeline", "[",
          "{", "$group", "{", "_id", "{", "state", "$state", "city", "$city", "}", "pop", "{", "$sum", "$pop", "}", "}", "}",
          "{", "$group", "{", "_id", "$_id.state", "avg_city_pop", "{", "$avg", "$pop", "}", "}", "}",
          "{", "$sort", "{", "avg_city_pop", BCON_INT32 (-1), "}", "}",
          "{", "$limit", BCON_INT32 (3), "}",
       "]");

       This aggregate pipeline produces:

       { "_id" : "DC", "avg_city_pop" : 303450.0 }
       { "_id" : "FL", "avg_city_pop" : 27942.29805615551 }
       { "_id" : "CA", "avg_city_pop" : 27735.341099720412 }

       The above aggregation pipeline is built from three pipeline operators: $group, $sort, and
       $limit.

       The first $group operator creates the following documents:

       { "_id" : { "state" : "WY", "city" : "Smoot" }, "pop" : 414 }

       Note that the $group operator can't use nested documents except in the _id field.

       The second $group uses these documents to create the following documents:

       { "_id" : "FL", "avg_city_pop" : 27942.29805615551 }

       These documents are sorted by the avg_city_pop field in descending order. Finally, the
       $limit pipeline operator returns the first 3 documents from the sorted set.
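
       The four stages can also be traced in plain C. The following is a minimal sketch over an
       in-memory array, not driver code and not the server's implementation. The zip_entry_t and
       state_avg_t types, the top_states_by_avg_city_pop function, and the 64-group cap are
       hypothetical names and limits chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define MAX_GROUPS 64 /* arbitrary cap for this sketch */

/* Hypothetical in-memory stand-in for one zip code document. */
typedef struct {
   const char *state;
   const char *city;
   long pop;
} zip_entry_t;

/* One output row, like the documents emitted by the second $group. */
typedef struct {
   const char *state;
   double avg_city_pop;
} state_avg_t;

/* qsort comparator: larger avg_city_pop first, like {$sort: {avg_city_pop: -1}}. */
static int
by_avg_desc (const void *a, const void *b)
{
   double x = ((const state_avg_t *) a)->avg_city_pop;
   double y = ((const state_avg_t *) b)->avg_city_pop;

   return (x < y) - (x > y);
}

/* Writes at most limit rows to out and returns the number written. */
static size_t
top_states_by_avg_city_pop (const zip_entry_t *rows, size_t n_rows,
                            state_avg_t *out, size_t limit)
{
   zip_entry_t cities[MAX_GROUPS]; /* one entry per (state, city) pair */
   state_avg_t states[MAX_GROUPS];
   long state_sum[MAX_GROUPS];
   size_t state_cnt[MAX_GROUPS];
   size_t n_cities = 0, n_states = 0, n_out, i, c, s;

   assert (rows != NULL || n_rows == 0);
   assert (out != NULL || limit == 0);

   /* First $group: sum pop over all zip codes of the same (state, city). */
   for (i = 0; i < n_rows; i++) {
      for (c = 0; c < n_cities; c++) {
         if (strcmp (cities[c].state, rows[i].state) == 0 &&
             strcmp (cities[c].city, rows[i].city) == 0) {
            break;
         }
      }
      if (c == n_cities) {
         if (n_cities == MAX_GROUPS) {
            continue; /* sketch only: rows beyond the cap are ignored */
         }
         cities[c] = rows[i];
         cities[c].pop = 0;
         n_cities++;
      }
      cities[c].pop += rows[i].pop;
   }

   /* Second $group: per state, accumulate city totals and city counts. */
   for (c = 0; c < n_cities; c++) {
      for (s = 0; s < n_states; s++) {
         if (strcmp (states[s].state, cities[c].state) == 0) {
            break;
         }
      }
      if (s == n_states) {
         states[s].state = cities[c].state;
         state_sum[s] = 0;
         state_cnt[s] = 0;
         n_states++;
      }
      state_sum[s] += cities[c].pop;
      state_cnt[s]++;
   }
   for (s = 0; s < n_states; s++) {
      states[s].avg_city_pop = (double) state_sum[s] / (double) state_cnt[s];
   }

   /* $sort, then $limit. */
   qsort (states, n_states, sizeof states[0], by_avg_desc);
   n_out = n_states < limit ? n_states : limit;
   for (s = 0; s < n_out; s++) {
      out[s] = states[s];
   }

   return n_out;
}
```

       The first grouping step matters: with made-up rows where city "X" in state "AA" has two
       zip codes (100 and 300) and city "Y" has one (200), the per-city totals are 400 and 200,
       so the state average is 300. Averaging the three zip code populations directly would give
       200 instead.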

COLOPHON

       This page is part of MongoDB C Driver. Please report any bugs at
       https://jira.mongodb.org/browse/CDRIVER.