If I have seen further it is by standing on the shoulders of (tiny) giants. - Sir Isaac Newton
custom-built molecular-beam epitaxy (MBE) machine builds
almost flawless 100 nanometer thick atomic layers.
Source: http://www.bnl.gov/energy/ces/news.asp?a=1447&t=pr
Often my projects look and feel like this machine:
* too complex
* too complicated
* too hard to maintain or improve
* really really expensive both in terms of time and money
Nodejs is an interesting platform where an alternative to
a single project that does everything is possible, because
its very nice dependency mechanism actually works in practice.
Dr. Gleb Bahmutov, PhD
Kensho, ex- lots of companies. HIRING
Win32 -> C/C++/Java/C#/CoffeeScript/JavaScript -> Node/Browser
~ 60 modules on NPM registry, 9 plugins for Grunt
- Problem, red flags
- Minimize complexity by separation
- Nodejs and npm
- Splitting large projects
- Keeping dependencies up to date using custom tools
As the project grows, look for red flags signalling that it has to be
refactored and split into smaller units
- build time > 15 seconds
- inability to unit test a specific feature
- two or more programming languages or environments:
DB + API, DB + API + front-end + worker machines
- using AND when describing the goals
Not red flags: size of the code base, number of lines,
number of people, software methodology.
- From clean to running > 10 minutes
- Insignificance
- Cannot reuse code somewhere else
In general, large projects are never finished and monolithic Node does not work well
- I have been on teams where setting up tools took days.
jQuery checkout and build takes 60 - 90 seconds!
- Hard to increase quality, since every positive change feels insignificant
- We have 10k lines of code, let's get a better return on investment by reusing them.
All software code suffers from interaction complexity
Each item can interact with every other item in the project.
var sum = a + b;
In a source file with 3 lines/variables/files 3 things interact.
As the number of lines increases, each item can interact with other items.
There is no physical separation, aside from folder structure, which
is easily broken.
Total number of communication links grows quadratically, as n*(n-1)/2, soon overwhelming
the mental capacity of any team. Thus the project calcifies, and new things
are harder to add without breaking the existing functionality.
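A minimal sketch of the n*(n-1)/2 count, and of how it drops when the same items are separated into groups that cannot reach each other. The function names are mine, and the sketch ignores the few links between groups that go through their public interfaces.

```javascript
// Number of possible pairwise links among n items: n * (n - 1) / 2
function links(n) {
  return n * (n - 1) / 2;
}

// Links that remain when items are split into isolated groups:
// only items inside the same group can interact.
function linksWhenSplit(groupSizes) {
  return groupSizes.reduce(function (total, size) {
    return total + links(size);
  }, 0);
}

console.log(links(3));                         // 3
console.log(links(100));                       // 4950
console.log(linksWhenSplit([25, 25, 25, 25])); // 1200
```

The same 100 items have about 4 times fewer possible interactions once they live in 4 separate units.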
- functions separate variables via scope
- source files separate code
- teams separate people
- dependencies separate files
We use physical separation all the time: a function, for example, is
scope separation enforced by the language.
Physically splitting the project into manageable chunks
cuts the number of ways different parts can interact.
Try very hard to have a graph without loops (single root tree, with root being
the ultimate application)
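One way to check for loops is a depth-first walk over the dependency graph. This is only a sketch: the graph object and the `findLoop` name are made up for illustration, and in a real project the graph would have to be built from the package.json files.

```javascript
// Hypothetical dependency graph: package name -> names it depends on.
var graph = {
  'app': ['my-utils', 'module-a'],
  'my-utils': ['module-a', 'module-b'],
  'module-a': [],
  'module-b': []
};

// Depth-first search; returns the first loop found as a list of names,
// or null if the graph has no loops.
function findLoop(deps) {
  var state = {}; // undefined = unvisited, 1 = on current path, 2 = done
  var path = [];

  function visit(name) {
    if (state[name] === 2) { return null; }
    if (state[name] === 1) {
      // we came back to a module that is still on the current path
      return path.slice(path.indexOf(name)).concat(name);
    }
    state[name] = 1;
    path.push(name);
    var children = deps[name] || [];
    for (var i = 0; i < children.length; i += 1) {
      var loop = visit(children[i]);
      if (loop) { return loop; }
    }
    path.pop();
    state[name] = 2;
    return null;
  }

  var names = Object.keys(deps);
  for (var k = 0; k < names.length; k += 1) {
    var found = visit(names[k]);
    if (found) { return found; }
  }
  return null;
}

console.log(findLoop(graph)); // null
```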
// function signature
function add(a, b) { body ... }
// package.json
{
  "name": "my-utils",
  "main": "index.js",
  "dependencies": {
    "module-a": "1.0.0",
    "module-b": "0.1.0"
  }
}
Take the same approach to packages as to functions.
Give it a good name (= function name), declare what it needs (dependencies = arguments),
and Node will know how to require('my-utils') by loading index.js
var a = require('another-module');
// index.js
if (module.parent) {
module.exports = ...
} else {
// work as CLI
}
var a = require(
'./node_modules/another-module/src/something.js');
- Single repo per app
- Shared code via separate repos via dependencies
- Separate version control / version numbers
Makes it very easy to split a project into individual packages. The dependency mechanism is managed by Node Package Manager.
- npm is itself an ordinary package
sudo npm update -g npm
Comes with nodejs install, generally works.
Most recent problems are due to the reliability of the public registry.
The number of listed and stored (as tar zip archives) packages
reached 63k.
You can use different versions of same module
// dependencies-resolution uses [email protected], [email protected]
// [email protected] uses [email protected]
$ npm list
[email protected]
├── [email protected]
└─┬ [email protected]
└── [email protected]
The NPM public registry keeps a copy of the entire package. The Bower registry only keeps URLs to git repos, and the author selects which files to include.
You can still exclude certain files using .npmignore file.
npm registry suffers from outages.
NPM tar archives include all the code, while bower is mostly
used to distribute final front end libraries.
I need ANSI colors in my terminal
npm search colors
npm info chalk
npm home chalk
npm install chalk --save
// index.js
var chalk = require('chalk');
console.log( chalk.blue('Hello world!') );
- Starting a module
- Replace parts with 3rd party
- Open source parts
- Clone 3rd party into private dependencies
npm init
// answer questions
(optional) set "private": true
- Use grunt-nice-package
- Look at yeoman, ninit
Do not reinvent the wheel. Search for a module that does what you need,
is updated frequently and has good readme and tests.
Common examples: logging, async/callback handling
- https://www.npmjs.org/ search by keyword
- http://www.jsdb.io/, http://www.javascriptoo.com/
- http://microjs.com/
- http://www.echojs.com/
- http://trendingjs.com/
jsdb displays average time between commits, number of active committers, CDN urls
javascriptoo has nice demos for every project
- http://www.echojs.com/, http://dailyjs.com/
- newsletters
- Look at what other packages use.
- License (look for MIT/BSD)
- Use node-license-sniffer or licensing
- Offline NPM cache (for CI agents for example)
npm config get cache
- Run private registry or proxy
Second great step when splitting a large project - open source
parts that are not your primary business.
The argument is economic: you will not have to maintain
source control (github), continuous build (Travis), code coverage (coveralls.io),
browsers (saucelabs). Plus you get actual people trying to use your module, hopefully
giving you feedback or even code contributions.
tldrlegal.com explains each license in plain English; the website is very well structured.
- Higher quality (next slide)
- Economic: use free infrastructure
- Human: happier employees, more visibility
Fragile = breaks quickly under stress: porcelain mug, glass statues, modern financial sector.
Antifragile = gets stronger under stress: nature, winning team during playoffs.
Open sourcing exposes package to low-level stress (code review, unit testing, reuse), making it more likely to withstand high stress (unexpected production issue). For more info see book review
One key trait of large projects: they are very brittle. Most parts
were never reused or tested under different circumstances, and are shielded from wide code review,
making them likely to deteriorate in quality over time.
- Setup tests and jshint using Travis-ci
- Use status badges
- Generate README using grunt-readme
You have only a few seconds to grab attention.
Please provide a good short package description: multiple drafts are ok!
Use badges to show if module is tested and up to date.
Describe a typical problem and give an example.
Same principles apply to internal modules: you do not know who
is going to use your module and what their level of expertise is.
They could be a super expert or could be a novice. Give them enough information
to decide if what you are offering fits their needs. They will do the same for you.
Please be kind to your audience.
Badges:
https://nodei.co/ - NPM info
travis - build info
coveralls - code coverage info
david-dm.org - dependencies
- Use semantic versioning
major.minor.patch
- Cannot npm publish -f anymore!
- Answer opened issues quickly
- Describe how your project is different
- Use README.md, avoid wiki
If your project is only used internally you can use whatever version names you like:
sprint numbers, week numbers, dates, seasons. Public modules should follow
semantic versioning.
0.1.0 < 0.1.x < 0.2.0 < 1.0.0-release < 1.0.0
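A minimal sketch of this ordering, written by hand rather than with a semver library. It only understands major.minor.patch with an optional "-tag" suffix, and compares tags as plain strings, which is simpler than the full semantic versioning rules. The names `parse` and `compare` are mine.

```javascript
// Parses "major.minor.patch" with an optional "-tag" suffix.
function parse(version) {
  var match = /^(\d+)\.(\d+)\.(\d+)(?:-(.+))?$/.exec(version);
  if (!match) {
    throw new Error('Not a major.minor.patch version: ' + version);
  }
  return {
    numbers: [Number(match[1]), Number(match[2]), Number(match[3])],
    tag: match[4] || null
  };
}

// Returns a negative number if a < b, positive if a > b, 0 if equal.
function compare(a, b) {
  var x = parse(a);
  var y = parse(b);
  for (var i = 0; i < 3; i += 1) {
    if (x.numbers[i] !== y.numbers[i]) {
      return x.numbers[i] - y.numbers[i];
    }
  }
  if (x.tag === y.tag) { return 0; }
  if (x.tag === null) { return 1; }  // 1.0.0 is newer than 1.0.0-release
  if (y.tag === null) { return -1; }
  return x.tag < y.tag ? -1 : 1;     // simplification: plain string order
}

var versions = ['1.0.0', '0.2.0', '1.0.0-release', '0.1.0', '0.1.5'];
console.log(versions.sort(compare));
// [ '0.1.0', '0.1.5', '0.2.0', '1.0.0-release', '1.0.0' ]
```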
Markdown has captured the majority of the docs share at the wiki's expense.
git fork public project to your private Git server
in package.json
"dependencies": {
  "name": "git://<your server url>/<name>#<tag or commit>"
}
The most powerful 30-second argument for using git + nodejs.
You DO NOT have to run private registry to use nodejs dependencies.
Just point package.json at git repos directly and use SHA-1 commit ids,
or tags instead of versions. NPM can use anything that resolves to
a folder with a package.json file as a target!
All the benefits of using versioned dependencies without any complexity.
Are we replacing the simplicity and consistency of a single project with a dependency mess?
Yes.
- the mess is manageable.
- the mess maps nicely to the software development:
different parts are developed at different speeds.
- using versioned dependencies isolates the true mess: constant merging of commits
In a single repo model, I could have just fixed the bug.
In a single repo, because it lacks isolation, you might fix 1 bug, but it might interact with other parts and introduce 10 new bugs.
This is because you are breaking the rule: when fixing a problem, change a single part.
In a single repo, all parts are moving and interacting.
Solving 1 or 2 interacting parts is easy.
Solving the 3-body problem in space is still a very interesting
physics question http://en.wikipedia.org/wiki/Three-body_problem
I could branch single repo and fix one bug
You can easily clone and branch individual dependencies too.
- coding
- refactoring
- integration
A general piece of programming advice is to separate coding (writing new features, testing)
from refactoring. Having separate physical links among modules makes it possible
to separate a 3rd step: integration, or version update.
// package.json
{
"dependencies": {
"module-a": "1.0.0"
}
}
// node_modules/module-a/package.json
{
"version": "0.8.0"
}
// npm registry:
module-a: 0.8.0, 0.9.0, 1.0.0, 2.0.0
There are two problems in the above situation: a small and a large one.
Small problem: declared version of module-a in package.json
differs from installed version in node_modules/
Large problem: module-a version is behind, there are multiple new
releases available
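The small problem can be detected by comparing two files. Below is a rough sketch of such a check, not the code of any existing tool. It assumes exact versions like "1.0.0" in package.json (no ranges), and the function name `findMismatches` is made up.

```javascript
var fs = require('fs');
var path = require('path');

// Returns the top level dependencies whose installed version
// differs from the exact version declared in package.json.
function findMismatches(projectFolder) {
  var pkg = JSON.parse(
    fs.readFileSync(path.join(projectFolder, 'package.json'), 'utf8'));
  var declared = pkg.dependencies || {};

  return Object.keys(declared).map(function (name) {
    var installedFile = path.join(
      projectFolder, 'node_modules', name, 'package.json');
    var installed = null; // stays null if the module is not installed
    if (fs.existsSync(installedFile)) {
      installed = JSON.parse(fs.readFileSync(installedFile, 'utf8')).version;
    }
    return { name: name, declared: declared[name], installed: installed };
  }).filter(function (dep) {
    return dep.installed !== dep.declared;
  });
}

// Usage, from a project folder:
//   console.log(findMismatches(process.cwd()));
```

For the situation above it would report module-a as declared 1.0.0 but installed 0.8.0.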
package.json dependencies get out of sync with
node_modules
Use deps-ok and
grunt-deps-ok.
Much faster than npm outdated
because only checks top level
dependencies' version numbers.
Package.json is stored in source control, and if someone updates a
dependency version declared there, other people sync the package.json
but do not know that node_modules is outdated. Add grunt-deps-ok as
the first step of your grunt default pipeline and they will get a nice
error message telling them to run `npm install`. It also supports bower dependencies.
10 top level dependencies.
- Which ones have new releases?
- Each project can iterate very fast
- Which ones I can upgrade without breaking my stuff?
- Can the upgrade process be automated?
Relevant Slashdot "A Call For Rollbacks To Previous Versions of Software"
Each project is constantly getting out of date.
The dependencies keep releasing versions, but you have not yet upgraded.
This is a good feature: it protects you.
Rule of thumb: staying 1 release behind the latest gives great benefits, while
letting bugs in the latest release be ironed out.
See all outdated modules (recursively)
npm outdated
or npm outdated --parseable|wc -l
Try `npm outdated` on any of your projects to see which
dependencies are out of date.
Most of these lines are meaningless - we cannot control dependencies deep down the
chain. We only care about top level dependencies WE specified
npm ls
- shows the dependency tree
npm ls --parseable
- shows the dependency paths
npm ls --parseable|sed 's/.*\/\(.*\)/\1/g'|sort|uniq|wc -l
- counts number of unique dependency modules
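The same count can be sketched in JavaScript. This assumes, like the sed pipeline above, that each line of `npm ls --parseable` is a folder path separated by "/" and ending in the module name. The sample paths and the function name are invented.

```javascript
// Counts unique module names in the output of `npm ls --parseable`,
// where each line is a folder path ending in the module name.
function countUniqueModules(parseableOutput) {
  var seen = Object.create(null);
  parseableOutput.split('\n').forEach(function (line) {
    var name = line.trim().split('/').pop();
    if (name) { seen[name] = true; }
  });
  return Object.keys(seen).length;
}

var sample = [
  '/projects/app',
  '/projects/app/node_modules/module-a',
  '/projects/app/node_modules/module-b',
  '/projects/app/node_modules/module-b/node_modules/module-a'
].join('\n');

console.log(countUniqueModules(sample)); // 3
```

Just like the shell version, the count includes the root project itself.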
slides-now depends on 302 modules
gt depends on 187 modules
next-update depends on 176 modules
coffee-script only depends on 2 immediate dependencies (mkdirp and docco), but this is not entirely true.
This is a true testament to the power of Nodejs: a lot of people who have never
met and do not communicate regularly built lots of projects, each contributing a small part.
Great tool to see your dependencies, including badges https://david-dm.org/.
You can drill down into dependencies.
coffee-script production lists only 5 dependencies,
but there are more dev dependencies that go into making coffee-script
https://david-dm.org/jashkenas/coffee-script#info=dependencies&view=tree
https://david-dm.org/jashkenas/coffee-script#info=devDependencies&view=tree
about 40 dependencies total
Rule 1: you control only the top level dependencies.
Rule 2: delaying the upgrade makes it harder
Can I update dependency A from 1.0.0 to 1.1.0?
- Fetches available versions
- Installs each version, runs tests
- Reports successful updates
YOUR PROJECT MUST HAVE TESTS
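The steps above can be sketched as a loop. This is an illustration of the idea only, not the actual tool: it does not fetch the version list (something like `npm view <name> versions` could supply it) and does not restore the original version afterwards. The function names and the injectable `run` argument are my own.

```javascript
var execSync = require('child_process').execSync;

// Runs a shell command and reports whether it exited with code 0.
function runCommand(command) {
  try {
    execSync(command, { stdio: 'ignore' });
    return true;
  } catch (err) {
    return false;
  }
}

// Tries each candidate version in turn; returns the ones that passed tests.
// `run` is injectable so the logic can be exercised without touching npm.
function findWorkingVersions(name, versions, run) {
  run = run || runCommand;
  return versions.filter(function (version) {
    return run('npm install ' + name + '@' + version) && run('npm test');
  });
}

// Usage, inside a project that has tests:
//   findWorkingVersions('module-a', ['1.1.0', '2.0.0']);
```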
If I structure my application the way I suggest: a graph of dependencies,
can I update a dependency without breaking my project?
0.1.0 version change means minor update. Can I believe a person I have never
met not to break something important to ME?
- Install latest version separately (default)
- Install latest versions for all modules at once
- Install and test each version of each module
Am I likely to successfully update A from 1.0.0 to 1.1.0?
I might not have enough tests, and would like to know in general if module A's version 1.1.0 is compatible with version 1.0.0
In general, I do not trust the convention that patch version changes
do not break things. I also might be using only a small subset of the features
and could upgrade across major versions just fine.
next-update-stats is a free public server collecting success / fail anonymous stats sent by next-update
next-update-stats server collects anonymous update success / fail stats for public modules.
Each time next-update tests A 1.0.0 -> 1.1.0 update it sends true / false to the server running
at http://next-update.herokuapp.com/
You can get JSON by using http://next-update.herokuapp.com/package/name or data for specific
version pair, for example http://next-update.herokuapp.com/package/lodash/1.0.0/2.4.0
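A small helper that builds these two kinds of URLs. Only the URL patterns come from the notes above; the helper name is made up, and the shape of the returned JSON is not shown because I am not assuming anything about it here.

```javascript
var BASE_URL = 'http://next-update.herokuapp.com';

// Builds the stats URL for a package, or for a specific version pair
// when both versions are given.
function statsUrl(name, fromVersion, toVersion) {
  var url = BASE_URL + '/package/' + encodeURIComponent(name);
  if (fromVersion && toVersion) {
    url += '/' + encodeURIComponent(fromVersion) +
      '/' + encodeURIComponent(toVersion);
  }
  return url;
}

console.log(statsUrl('lodash'));
// http://next-update.herokuapp.com/package/lodash
console.log(statsUrl('lodash', '1.0.0', '2.4.0'));
// http://next-update.herokuapp.com/package/lodash/1.0.0/2.4.0
```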
Large number of projects getting out of date is a problem.
Example: http://glebbahmutov.com/status/ shows the
status for most of my projects in single dashboard
Can I automate next-update for all these projects?
Apple has switched the way updates are installed.
- iOS6 - pull mechanism
- iOS7 - push mechanism
iOS 6 asked you to update either each application or all of them.
iOS 7 installs updates by default. You have no assurance that
your data or settings are going to be preserved, although you assume they will be.
- Clone your git repos one by one
- If builds ok:
- Runs next-update
- If successful AND strong positive next-update-stats
- update dependency to latest
- push update to origin
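The decision in the middle of this list could look roughly like the sketch below. Everything specific in it is an assumption: the `stats` object with `success` and `failure` counts, and the thresholds for what counts as "strong positive", are hypothetical and would need tuning.

```javascript
// Decides whether a dependency update can be pushed automatically.
// `stats` shape is an assumption: counts of reported successful and
// failed updates for this version pair.
function shouldAutoUpdate(buildOk, nextUpdateOk, stats, options) {
  options = options || {};
  var minReports = options.minReports || 10;          // hypothetical threshold
  var minSuccessRate = options.minSuccessRate || 0.9; // hypothetical threshold
  var total = stats.success + stats.failure;
  if (!buildOk || !nextUpdateOk || total < minReports) {
    return false;
  }
  return stats.success / total >= minSuccessRate;
}

console.log(shouldAutoUpdate(true, true, { success: 95, failure: 5 })); // true
console.log(shouldAutoUpdate(true, true, { success: 3, failure: 0 }));  // false
```

The second call returns false because 3 reports are too few to trust, even though all of them were successful.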
next-update-failed - include in your project as dev dependency. Whenever someone uses next-update and it fails for your package, you will get detailed information.
The work has only started, but this seems like a good idea for enterprise customers.
I was inspired by lots of interesting errors reported by the actual customers when we
started using Sentry http://bahmutov.calepin.co/know-unknown-unknowns-with-sentry.html
I could never have guessed the variety of errors that people experience and that our automated
and manual testing lets through.
- npm-developers
- Heroku: 10 Habits of a Happy Node Hacker
- Publishing a simple package to npm
- What has changed in new version?
Splitting a large project into smaller ones and assembling them makes it simpler to:
- understand, code and test
- complete things
- be happy
You are admitting that you are a human being: your mental capacity
has a hard limit: it can only keep 4-7 things in your immediate memory at a time.
Splitting things into dependencies lets you remove a bunch of things from
your mind and concentrate on the immediate task at hand.
Splitting a large project makes staying up to date a problem.
next-update - done
next-update-stats - done
next-updater - in progress
next-update-failed - to be started
Any help will be appreciated, @bahmutov
Thank you
[slides-now-footer]: "deps like a boss" [slides-now-theme]: "full" [slides-now-timer]: "60"