So with these new requirements a few things stand out to me:
- Weekly updates
- Comparison against the previous week's value
- Possible future data analysis.
The first part is what we already knew: product prices change somewhat often, but not crazy often, so a nested array of prices could technically stay under MongoDB's 16 MB document size limit for the foreseeable future.
But the other two parts are what I’d consider the real deciding factors in how to set up the DB. For the comparison you only need the previous week's value and the current week's value (which you get through web scraping), so you will probably want to fetch the last stored value for a given item as efficiently as possible (i.e. you don't load every single item into Node.js memory and run find on it).
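To make the "get the last value efficiently" part concrete, here is a minimal sketch, assuming a separate `prices` collection. The field names (`productId`, `price`, `scrapedAt`) and the helper names are made up for illustration, not something from your existing schema.

```typescript
// Hypothetical shape of one document in a separate "prices" collection.
interface PriceDoc {
  productId: string;
  price: number;
  scrapedAt: Date;
}

// Compound index spec (assumed field names). With this index the database can
// jump straight to the newest price of one product instead of scanning them all.
const latestPriceIndex = { productId: 1, scrapedAt: -1 } as const;

// Builds the filter and options for a "newest price of this product" query.
function latestPriceQuery(productId: string) {
  return {
    filter: { productId },
    options: { sort: { scrapedAt: -1 as const } },
  };
}

// Difference between the freshly scraped price and the last stored one.
// Returns null when there is no previous price to compare against.
function priceDelta(previous: PriceDoc | null, currentPrice: number): number | null {
  if (previous === null) return null;
  return currentPrice - previous.price;
}
```

With the official Node.js driver you would pass `latestPriceIndex` to `createIndex` once, and then run something like `prices.findOne(filter, options)` per product, so only one small document ever reaches Node.js.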
I feel like separating the prices into their own schema gives you more options for optimizing the "get the last value" query than nesting the values does.
On top of this, the third potential requirement (running some data analysis over all the product prices) probably won't be performed often, so the cost of finding the prices for each product across all products shouldn’t be an issue. There are a number of ways to optimize that query, and a separate prices collection can also be used to run simpler queries against all prices for any number of products/product types. (lots of potential here :D)
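As one example of the kind of analysis query this opens up, here is a rough sketch of an aggregation pipeline that summarizes prices per product. It assumes the same hypothetical `prices` documents with `productId`, `price` and `scrapedAt` fields; the function name and output field names are made up.

```typescript
// Builds an aggregation pipeline that summarizes prices per product,
// only looking at prices scraped on or after `since`.
function priceStatsPipeline(since: Date): Record<string, unknown>[] {
  return [
    // Narrow the data down to the time window first.
    { $match: { scrapedAt: { $gte: since } } },
    // Then compute one summary document per product.
    {
      $group: {
        _id: "$productId",
        avgPrice: { $avg: "$price" },
        minPrice: { $min: "$price" },
        maxPrice: { $max: "$price" },
        samples: { $sum: 1 },
      },
    },
  ];
}
```

You would hand the returned array to `aggregate` on the prices collection. Since this only runs occasionally, it doesn't need to be as tight as the weekly "last value" query.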
So to sum it up, I’d say just create a separate schema, leverage more of MongoDB's utilities, and use a good old-fashioned lookup to get the prices for your products. I believe the advantages of two schemas outweigh the advantages of nesting the data.
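For the lookup itself, a minimal sketch could look like the pipeline below, run against the products collection. It assumes the prices live in a collection named `prices` and reference the product through a `productId` field that matches the product's `_id`. Those names, and `recentPrices`, are assumptions for the example.

```typescript
// Builds a pipeline for the products collection that attaches the newest
// `perProduct` prices to every product via $lookup.
function productsWithRecentPrices(perProduct: number = 1): Record<string, unknown>[] {
  return [
    {
      $lookup: {
        from: "prices",            // assumed name of the separate collection
        let: { pid: "$_id" },      // expose the product's _id to the sub-pipeline
        pipeline: [
          // Keep only prices that belong to this product.
          { $match: { $expr: { $eq: ["$productId", "$$pid"] } } },
          // Newest first, then keep just the few we need.
          { $sort: { scrapedAt: -1 } },
          { $limit: perProduct },
        ],
        as: "recentPrices",
      },
    },
  ];
}
```

Each product then comes back with a small `recentPrices` array instead of its whole price history, which is the main thing nesting can't give you as easily.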
PS. There is a reason why SQL is still as useful as NoSQL and remains popular: its structured data model still works for most use cases, and it’s optimized for those use cases. So you could have done this in SQL with the same setup and reaped the same benefits. Not saying you should switch over right now, just wanted to give some “food for thought” for the future.