Adding write-time aggregation functions #35

@JohnAD

Typically, on a web site, usage data is written as lightly and as quickly as possible to a log. Then a secondary process (possibly on a secondary server) does statistical analysis on that data. This totally makes sense for storage in a traditional database such as SQL, and certainly for an Apache-style text log.

Philosophically, however, MongoDB (and related NoSQL databases) can take a different approach. They are designed for scalability using a variety of techniques, including an emphasis on read optimization at the expense of write optimization. The lack of normalization, for example, makes it expensive to update (write) certain data, because such updates might have to occur across many documents. But in exchange for that, a read of any one document in a collection need never reference another document, because all the important information is already gathered.

Sorry to be so windy, but I want to justify my crazy idea. :) And that is this:

Rather than just write a single document to a collection on each page response, also allow aggregate updates on other documents at the same time.

For example, it is common in web log analysis to record the number of visits to each page over certain periods of time. Say hourly, daily, monthly. So, when flask-track-usage records a response, it could also upsert the url/datetime/period documents corresponding to it with incremented totals. In this example, it would update 3 additional documents.
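To make the hourly/daily/monthly example concrete, here is a minimal sketch of how the three extra upserts could be built for one response. Everything in it is an assumption for illustration: the helper names `period_updates` and `apply_upsert`, the `hits` field, and the document layout are all hypothetical, and the in-memory dict only stands in for a real collection.

```python
from datetime import datetime


def period_updates(url, when):
    """Return one (filter, update) pair per aggregation period.

    Hypothetical helper: each filter identifies the url/period/start
    document, and each update increments its running total.
    """
    starts = {
        "hour": when.replace(minute=0, second=0, microsecond=0),
        "day": when.replace(hour=0, minute=0, second=0, microsecond=0),
        "month": when.replace(day=1, hour=0, minute=0, second=0, microsecond=0),
    }
    return [
        ({"url": url, "period": period, "start": start}, {"$inc": {"hits": 1}})
        for period, start in starts.items()
    ]


def apply_upsert(store, filt, update):
    """In-memory stand-in for an upsert with $inc; `store` is a plain dict."""
    key = (filt["url"], filt["period"], filt["start"])
    # Create the document with a zero total if it does not exist yet.
    doc = store.setdefault(key, dict(filt, hits=0))
    for field, amount in update["$inc"].items():
        doc[field] = doc.get(field, 0) + amount


store = {}
visits = [
    datetime(2018, 4, 3, 14, 5),
    datetime(2018, 4, 3, 14, 50),
    datetime(2018, 4, 3, 16, 0),
]
for when in visits:
    for filt, update in period_updates("/index", when):
        apply_upsert(store, filt, update)
```

With pymongo, each pair could instead be passed to `collection.update_one(filt, update, upsert=True)`, so the three aggregate documents are created on first use and incremented afterwards.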

This would be implemented as an option of course. There would be scenarios where such aggregate work would be a bad idea or pointless.

One possibility is to have it done as a post-storage function call. For example:

def myCrazySummationUtility(data):
    # here is where I do all my extra stuff with the dictionary
    # contained in 'data'
    ...

t = TrackUsage(app, PrintStorage(post=myCrazySummationUtility))
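As a rough sketch of what the storage side of such a hook might look like, the class below calls an optional `post` function after it has stored each record. `HookedPrintStorage`, its `store` method, and the `count_page_hits` callback are hypothetical names invented for this illustration, not the existing flask-track-usage API.

```python
class HookedPrintStorage:
    """Hypothetical stand-in for a storage class that accepts a `post` callable."""

    def __init__(self, post=None):
        self.post = post

    def store(self, data):
        # Primary write: here it just prints the record.
        print(data)
        # Post-storage hook: runs only if the user supplied one.
        if self.post is not None:
            self.post(data)


page_hits = {}


def count_page_hits(data):
    """Example callback: keep a running total of hits per url."""
    page_hits[data["url"]] = page_hits.get(data["url"], 0) + 1


storage = HookedPrintStorage(post=count_page_hits)
storage.store({"url": "/index", "status": 200})
storage.store({"url": "/about", "status": 200})
storage.store({"url": "/index", "status": 200})
```

Keeping the hook optional means existing setups behave exactly as before, and the aggregate work only happens where someone has opted in.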

Thoughts?
