CLI tool for extracting values of a specific field from a MongoDB collection and saving them into a target collection.
Supports batching, large dataset processing, and flexible write configurations.
- Extract values of any field from MongoDB documents.
- Data filtering using $match.
- Batching (batchSize) to avoid MongoDB's 16 MB per-document limit.
- ObjectId transformation: ObjectId('68a8c8207090be6dd0e23a90') → '68a8c8207090be6dd0e23a90'.
- Large collections supported via allowDiskUse.
- Flexible array handling:
  - Overwrite or append to arrays.
  - Allow or eliminate duplicates.
- Informative logs.
- Install the package:
npm i mongo-collector
- Add a script in your package.json:
"scripts": {
  "mongoCollector": "mongo-collector"
}
- In the root of the project, create a file named mongo-collector.config.js.
Example of file contents:
export default {
  source: {
    uri: "mongodb://127.0.0.1:27017",
    db: "crystalTest",
    collection: "users",
    field: "_id",
    match: {}
  },
  target: {
    uri: "mongodb://127.0.0.1:27017",
    db: "pool",
    collection: "usersIdFromCrystalTest",
    field: "users",
    documentId: false,
    rewriteDocuments: true,
    rewriteArray: true,
    duplicatesInArray: false,
    unwrapObjectId: true
  },
  aggregation: {
    allowDiskUse: true,
    batchSize: 200
  },
};
- Run from the project root:
npm run mongoCollector
Source collection users (from source):
{ "_id": ObjectId("68a8c8207090be6dd0e23a90"), "name": "Alice" }
{ "_id": ObjectId("68a8c8207090be6dd0e23a91"), "name": "Sarah" }
{ "_id": ObjectId("68a8c8207090be6dd0e23a92"), "name": "John" }
After running mongo-collector, in the target collection usersIdFromCrystalTest:
{ "users": [ "68a8c8207090be6dd0e23a90", "68a8c8207090be6dd0e23a91", "68a8c8207090be6dd0e23a92" ] }
You can use any match configuration, for example:
match: {}
- take all documents.
match: { createdAt: { $gte: new Date("2025-08-20T01:26:11.327+00:00") } }
- filter documents by date.
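As a rough illustration of how a match filter could feed into the extraction, the sketch below builds an aggregation pipeline from the source settings. The helper name buildPipeline and the output field name value are hypothetical, and the whole thing is an assumption about the general approach rather than mongo-collector's actual internals.

```javascript
// Hypothetical helper: builds an aggregation pipeline from the `source` block of the config.
// Illustrative assumption only, not mongo-collector's real implementation.
function buildPipeline(source) {
  return [
    // Keep only the documents that satisfy the filter; {} matches everything.
    { $match: source.match ?? {} },
    // Keep only the field being collected, exposed under a fixed (hypothetical) name.
    { $project: { _id: 0, value: `$${source.field}` } },
  ];
}

const pipeline = buildPipeline({
  field: "_id",
  match: { createdAt: { $gte: new Date("2025-08-20T01:26:11.327+00:00") } },
});
console.log(JSON.stringify(pipeline, null, 2));
```

Any valid $match expression can be passed the same way, since the filter object is handed to the stage unchanged.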
documentId: false
- create a new document.
documentId: '68a8c8207090be6dd0e23a90'
- append data to an existing document, or create one with this _id if missing.
rewriteDocuments: true
- clear the entire target collection before writing.
rewriteArray: true
- overwrite the existing array.
rewriteArray: false
- append to an existing array.
duplicatesInArray: false
- eliminate duplicates (uses $addToSet).
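One way to picture how rewriteArray and duplicatesInArray combine is as a choice between standard MongoDB update operators. The sketch below is a minimal illustration under that assumption; the helper name buildArrayUpdate is hypothetical and the tool's real logic may differ.

```javascript
// Hypothetical helper: picks a MongoDB update document for one batch of values.
// Assumption: this mirrors the documented behavior, not the tool's actual source.
function buildArrayUpdate(field, values, { rewriteArray, duplicatesInArray }) {
  if (rewriteArray) {
    // Overwrite: replace whatever array is stored under `field`.
    // Set removes repeated primitives (such as strings) but not equal objects.
    const next = duplicatesInArray ? values : [...new Set(values)];
    return { $set: { [field]: next } };
  }
  if (duplicatesInArray) {
    // Append every value, keeping duplicates.
    return { $push: { [field]: { $each: values } } };
  }
  // Append, skipping values that are already present in the array.
  return { $addToSet: { [field]: { $each: values } } };
}

console.log(
  JSON.stringify(
    buildArrayUpdate("users", ["a", "b"], { rewriteArray: false, duplicatesInArray: false })
  )
);
```

An update document like this could then be passed to the driver's updateOne call, with upsert enabled when the target document may not exist yet.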
unwrapObjectId: true
- ObjectId('68a8c8207090be6dd0e23a90') → '68a8c8207090be6dd0e23a90' (final result in target).
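The ObjectId conversion can be sketched as a small pass-through function. The helper name unwrapObjectId is hypothetical here, and the check is duck-typed so the example runs without the mongodb driver; ObjectId instances from the official driver expose toHexString().

```javascript
// Hypothetical helper: converts ObjectId-like values to their hex string form.
// Assumption: anything with a toHexString() method is treated as an ObjectId.
function unwrapObjectId(value) {
  if (value !== null && typeof value === "object" && typeof value.toHexString === "function") {
    return value.toHexString();
  }
  // Values that are not ObjectId-like pass through unchanged.
  return value;
}

// Stand-in for a driver ObjectId, used only for this demonstration.
const fakeId = { toHexString: () => "68a8c8207090be6dd0e23a90" };
console.log(unwrapObjectId(fakeId)); // prints 68a8c8207090be6dd0e23a90
console.log(unwrapObjectId("Alice")); // prints Alice
```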
allowDiskUse: true
- allows MongoDB to write temporary data to disk when processing aggregation stages.
- Use this option for large datasets to avoid memory limitations.
allowDiskUse: false
- restricts processing to memory only.
- This can improve performance, but may result in errors if the dataset is too large to fit into memory.
batchSize: 10
- controls the length of the array inside each target document.
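To make the effect of batchSize concrete, the sketch below splits a list of collected values into chunks and wraps each chunk in its own target document. The helper name splitIntoBatches is hypothetical, and treating each batch as a separate document is an assumption based on the description above.

```javascript
// Hypothetical helper: splits collected values into arrays of at most batchSize items.
function splitIntoBatches(values, batchSize) {
  if (!Number.isInteger(batchSize) || batchSize < 1) {
    throw new RangeError("batchSize must be a positive integer");
  }
  const batches = [];
  for (let i = 0; i < values.length; i += batchSize) {
    batches.push(values.slice(i, i + batchSize));
  }
  return batches;
}

// 25 collected values with batchSize 10: two full batches and one partial batch.
const ids = Array.from({ length: 25 }, (_, i) => `id-${i}`);
const targetDocs = splitIntoBatches(ids, 10).map((users) => ({ users }));
console.log(targetDocs.map((doc) => doc.users.length)); // prints [ 10, 10, 5 ]
```

Smaller batches keep each target document further from the 16 MB limit at the cost of producing more documents.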
An example of mongoCollector in operation: