Coding Practices SSW

The document outlines best practices for scientific computing and software engineering, emphasizing the importance of coding as a critical skill in research. It covers various management levels, including code, data, directory, and project management, while promoting modular programming, documentation, and collaboration. The conclusion highlights the significance of adhering to software engineering practices to enhance efficiency and user-friendliness in software development.

Uploaded by

Rohit Singh

Numerical Software Engineering 101/201
Scientific Software Club 2/13/17
Papers
● Best Practices for Scientific Computing, Wilson et al.
● Good Enough Practices in Scientific Computing, Wilson et al.
● Barely Sufficient Software Engineering: 10 Practices to Improve your CSE
Software, Heroux and Willenbring
Misconception: Coding is unimportant! It’s not like I’m a software
engineer...

(The crucial part is getting the numerical algorithm, proper data, good results, etc)
The (Relative) Truth: Coding is an important part of research and
a skill that takes years to hone

(Teach Yourself Programming in Ten Years by Peter Norvig)

Topics
● Code Level Management
● Data Management
● Directory Level Management
● Project Level Management
● Working with Others
● Documentation and Technical Writing
Code Level Management
Comment Succinctly (Design, Not Mechanism)
double AreaRectangle(double x, double y){
    /* AreaRectangle calculates the area of a
       rectangle with dimensions x and y */

    /* Return -1 if bad input */

    if(x < 0 || y < 0){
        printf("x and y must be non-negative numbers\n");
        return -1;
    }
    /* Return the product of x and y */
    return x*y;
}
Comment Succinctly
/*
runN4SID runs the system identification algorithm n4sid

~~~~INPUT~~~~
data: N x K time domain signal, N = number samples, K = dimension of data
p: includes measurement frequency in Hz, model size to fit
~~~~~~~~~~~~~

~~~OUTPUT~~~
Fitted system model, saved in results folder as system.csv
~~~~~~~~~~~~~
*/
void runN4SID(double data, params p){
…
}
Name Intelligently
● Fits in with the earlier example: descriptive function and variable names
are extremely important
● A headache for numerical calculations
○ The body of the code might be ugly, but make sure the function itself is named well!
Name Variables Intelligently
void calcStuff(...){
A = getMatrix(...);
[U, D, V] = svd(A);
[X, Y] = getData(...);
[E, Z] = eig(X*A*Y);
w = getWeights...();
[S, N] = sumEV(W, w);
B = convolveMatrix(A, N, S)
I = [ identity(N); identity(N)];
C = I*B + I*A;
[Q, R, P] = qr(C);
….
(you get the point)
}
class Central2D {
public:
    Central2D(float w, float h,     // Domain width / height
              int nx, int ny,       // Number of cells in x/y (without ghosts)
              float cfl = 0.45) :   // Max allowed CFL number
        nx(nx), ny(ny),
        nx_all(nx + 2*nghost),
        ny_all(ny + 2*nghost),
        dx(w/nx), dy(h/ny),
        cfl(cfl) {}

    static constexpr int nghost = 3;  // Number of ghost cells
    const int nx, ny;                 // Number of (non-ghost) cells in x/y
    const int nx_all, ny_all;         // Total cells in x/y (including ghost)
    const float dx, dy;               // Cell size in x/y
    const float cfl;                  // Allowed CFL number

    // Array accessor functions
    int offset(int ix, int iy) const { return iy*nx_all+ix; }

    float& u1(int ix, int iy) { return u1_[offset(ix,iy)]; }   // Solution values
    float& u2(int ix, int iy) { return u2_[offset(ix,iy)]; }
    float& u3(int ix, int iy) { return u3_[offset(ix,iy)]; }
    float& f1(int ix, int iy) { return f1_[offset(ix,iy)]; }   // Fluxes in x
    float& f2(int ix, int iy) { return f2_[offset(ix,iy)]; }
    float& f3(int ix, int iy) { return f3_[offset(ix,iy)]; }
    float& g1(int ix, int iy) { return g1_[offset(ix,iy)]; }   // Fluxes in y
    float& g2(int ix, int iy) { return g2_[offset(ix,iy)]; }
    float& g3(int ix, int iy) { return g3_[offset(ix,iy)]; }
    float& ux1(int ix, int iy) { return ux1_[offset(ix,iy)]; } // x differences of u
    float& ux2(int ix, int iy) { return ux2_[offset(ix,iy)]; }
    float& ux3(int ix, int iy) { return ux3_[offset(ix,iy)]; }
    float& uy1(int ix, int iy) { return uy1_[offset(ix,iy)]; } // y differences of u
    float& uy2(int ix, int iy) { return uy2_[offset(ix,iy)]; }
    float& uy3(int ix, int iy) { return uy3_[offset(ix,iy)]; }
    float& fx1(int ix, int iy) { return fx1_[offset(ix,iy)]; } // x differences of f
    float& fx2(int ix, int iy) { return fx2_[offset(ix,iy)]; }
    float& fx3(int ix, int iy) { return fx3_[offset(ix,iy)]; }
    float& gy1(int ix, int iy) { return gy1_[offset(ix,iy)]; } // y differences of g
    float& gy2(int ix, int iy) { return gy2_[offset(ix,iy)]; }
    float& gy3(int ix, int iy) { return gy3_[offset(ix,iy)]; }
    float& v1(int ix, int iy) { return v1_[offset(ix,iy)]; }   // Solution values at next
    float& v2(int ix, int iy) { return v2_[offset(ix,iy)]; }
    float& v3(int ix, int iy) { return v3_[offset(ix,iy)]; }

    // Diagnostics
    void solution_check();

    // Array size accessors
    int xsize() const { return nx; }
    int ysize() const { return ny; }

    // Read / write elements of simulation state
    float& operator()(int i, int j) {
        return u1_[offset(i,j)];
    }

    const float& operator()(int i, int j) const {
        return u1_[offset(i,j)];
    }

    // Wrapped accessor (periodic BC)
    int ioffset(int ix, int iy) {
        return offset( (ix+nx-nghost) % nx + nghost,
                       (iy+ny-nghost) % ny + nghost );
    }

    float& uwrap1(int ix, int iy) { return u1_[ioffset(ix,iy)]; }
    float& uwrap2(int ix, int iy) { return u2_[ioffset(ix,iy)]; }
    float& uwrap3(int ix, int iy) { return u3_[ioffset(ix,iy)]; }

    void run(float tfinal);

    // Call f(Uxy, x, y) at each cell center to set initial conditions
Decompose Programs into Functions
● Try to keep functions short
● Modularity makes code base more flexible, more easily modifiable
● Saves lines of code
● Practically speaking, humans can only remember a few things at a time!
Decomposing Programs into Functions
void calcStuff(...){
    Node root;
    …
    Node data;
    …
    bool checkchild = 0;
    for(i = 0; i < root.numchildren; i++){
        if(root.child[i] == data){
            checkchild = 1;
        }
    }
    ...
}
VS
void calcStuff(...){
    Node root;
    …
    Node data;
    …
    bool checkchild = isChild(root, data);
    ...
}
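The helper that the slide calls isChild is not shown. A minimal sketch of what it could look like follows; the Node layout (an id plus a list of children) and the id-based equality are assumptions made only for illustration.

```cpp
#include <cassert>
#include <vector>

// Hypothetical minimal Node type; the slide does not define Node.
struct Node {
    int id;
    std::vector<Node> children;
};

// Assumption: two nodes are the same node when their ids match.
bool operator==(const Node& a, const Node& b) { return a.id == b.id; }

// Returns true if `data` is a direct child of `root`.
bool isChild(const Node& root, const Node& data) {
    for (const Node& child : root.children) {
        if (child == data) {
            return true;  // stop at the first match
        }
    }
    return false;
}

// Usage check: a root with a single child.
void exampleUsage() {
    Node root{0, {}};
    Node a{1, {}};
    Node b{2, {}};
    root.children.push_back(a);
    assert(isChild(root, a));
    assert(!isChild(root, b));
}
```

Once the loop lives in its own function, calcStuff reads as a sequence of named steps, and the search can be tested on its own.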
Eliminate Duplication
double calcValues(...){
    …
    X = getvalue(...);
    return X;
}
double calcValuesFilter(...){
    …
    X = getvalue(...);
    X = filter(X);
    return X;
}
VS
double calcValues(... , bool Filter){
    …
    X = getvalue(...);
    if( Filter == true){
        X = filter(X);
    }
    return X;
}
Keep Semantics Consistent
void scaleVec(vec v, double n){
    ...
}
VS
void scaleMatrix(double n, matrix m){
    ...
}

void filterEigenVecs(Matrix M){
    ...
}
VS
void filterEigVals(Matrix M){
    ...
}

void find_all_keys(keys K){
    ...
}
VS
void findAllKeyrings(rings R){
    ...
}
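For contrast, here is a small sketch of the first pair written with consistent semantics: both functions take the object first and the scale factor second, and both use the same naming style. The Vec and Matrix aliases are assumptions standing in for the slide's undefined types.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-ins for the slide's vec and matrix types.
using Vec = std::vector<double>;
using Matrix = std::vector<Vec>;

// Same argument order in both functions: object first, factor second.
void scaleVec(Vec& v, double n) {
    for (double& x : v) {
        x *= n;
    }
}

void scaleMatrix(Matrix& m, double n) {
    for (Vec& row : m) {
        scaleVec(row, n);  // reuse the vector version row by row
    }
}

// Usage check for both functions.
void exampleUsage() {
    Vec v{1.0, 2.0};
    scaleVec(v, 2.0);
    assert(v[0] == 2.0 && v[1] == 4.0);

    Matrix m{{1.0, 2.0}, {3.0, 4.0}};
    scaleMatrix(m, 0.5);
    assert(m[1][0] == 1.5);
}
```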
Use Data Structures (If necessary)
void doStuff(...
    double timestep, int size...
    date d, int dimx, int dimy…
    int numthreads){
    ...
}
VS
class metadata{
public:
    double timestep;
    int size;
    date d;
    int dimx;
    int dimy;
    int numthreads;
};

void doStuff(metadata d){
    …
}
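A compilable sketch of the same idea, under stated assumptions: the field names follow the slide, while the default values, the string used in place of the slide's date type, and the numCells helper are hypothetical.

```cpp
#include <cassert>
#include <string>

// Settings bundle; defaults are illustrative assumptions.
struct Metadata {
    double timestep = 0.01;
    int size = 0;
    std::string date;  // stands in for the slide's `date` type
    int dimx = 1;
    int dimy = 1;
    int numthreads = 1;
};

// Hypothetical consumer: one parameter instead of six.
int numCells(const Metadata& m) {
    return m.dimx * m.dimy;
}

// Usage check: set only the fields that differ from the defaults.
void exampleUsage() {
    Metadata m;
    m.dimx = 4;
    m.dimy = 3;
    assert(numCells(m) == 12);
}
```

Adding a new setting later means adding one field, not changing every function signature that passes the settings along.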
Incremental Changes
● Emphasized in two papers
○ Decompose a large task into small components
○ Test the correctness of components
● Programmers are most productive working in small steps
○ Small steps also allow frequent course correction
Defensive Programming
● Assert (or Try/Catch)
● Unit Testing
○ What if no “useful” unit tests?
○ Numeric Unit Tests
● Automated Testing and Continuous Integration
○ (to be covered in the future)
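As a rough illustration of the first two bullets, the sketch below guards a precondition with an assertion and checks a numeric result against a tolerance instead of exact equality. The function names and the tolerance value are assumptions, not part of the talk.

```cpp
#include <cassert>
#include <cmath>

// Area of a rectangle; the precondition is enforced with an assertion.
double areaRectangle(double x, double y) {
    assert(x >= 0 && y >= 0 && "dimensions must be non-negative");
    return x * y;
}

// Compares two doubles relative to the larger magnitude, with a floor
// of 1 so that values near zero are compared absolutely.
bool approxEqual(double a, double b, double relTol = 1e-12) {
    double scale = std::fmax(std::fabs(a), std::fabs(b));
    return std::fabs(a - b) <= relTol * std::fmax(scale, 1.0);
}

// Numeric unit test: compare against known analytic values.
// Note that 0.1 * 3.0 == 0.3 is false in floating point, so an exact
// comparison would fail even though the code is correct.
void testAreaRectangle() {
    assert(approxEqual(areaRectangle(0.1, 3.0), 0.3));
    assert(approxEqual(areaRectangle(2.0, 0.0), 0.0));
}
```

When no closed-form answer exists, the same pattern can compare against a previously validated result or a slower reference implementation.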
Abstractions
● Computer systems researchers often talk about getting the right
“abstractions”
○ Abstractions decrease the complexity of your software by hiding the low-level details
from the user
● Defining a convenient way to interact with your code base is hard!
○ Takes practice… cannot be quantified
○ What do you expose to the user (one of whom will surely be yourself)?
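One possible illustration, not taken from the talk: a hypothetical Grid2D class that exposes (ix, iy) indexing and size queries while hiding the fact that the values live in a single flat array.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical 2D grid; callers never see the flat row-major storage.
class Grid2D {
public:
    Grid2D(int nx, int ny)
        : nx_(nx), ny_(ny),
          data_(static_cast<std::size_t>(nx) * static_cast<std::size_t>(ny), 0.0f) {}

    // Read / write access by grid coordinates.
    float& operator()(int ix, int iy) { return data_[offset(ix, iy)]; }
    const float& operator()(int ix, int iy) const { return data_[offset(ix, iy)]; }

    int xsize() const { return nx_; }
    int ysize() const { return ny_; }

private:
    // Low-level detail kept private: index arithmetic and bounds check.
    std::size_t offset(int ix, int iy) const {
        assert(ix >= 0 && ix < nx_ && iy >= 0 && iy < ny_);
        return static_cast<std::size_t>(iy) * static_cast<std::size_t>(nx_)
               + static_cast<std::size_t>(ix);
    }

    int nx_, ny_;
    std::vector<float> data_;
};
```

Because the layout is private, it could later change (for example to add ghost cells) without touching any code that uses the grid.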
Data Level Management
Save Raw and Intermediate Data
● Raw data D >> Intermediate Forms >> Result (yes or no)
○ You don’t just want to save the yes/no!
● Save Raw and Intermediate Forms
○ Saves time, extra processing, etc
Format Data Well
● Create data you wish to see in the world
○ Neatly labeled columns, information on format, etc
○ Important, especially if your data format changes down the road
● Space is cheap!
○ One variable per column, one observation per row, etc
○ Don’t cram!
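A small sketch of the layout described above: one labeled column per variable, one row per observation. The Observation record, its field names, and the units in the header are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical observation record; names and units are illustrative.
struct Observation {
    double time_s;
    double temperature_K;
    double pressure_Pa;
};

// Writes a header row naming each variable (with units),
// then one row per observation.
std::string toCsv(const std::vector<Observation>& rows) {
    std::ostringstream out;
    out << "time_s,temperature_K,pressure_Pa\n";
    for (const Observation& r : rows) {
        out << r.time_s << ',' << r.temperature_K << ',' << r.pressure_Pa << '\n';
    }
    return out.str();
}

// Usage check: the header is present even when there are no rows.
void exampleUsage() {
    std::vector<Observation> none;
    assert(toCsv(none) == "time_s,temperature_K,pressure_Pa\n");
}
```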
Manage Your Metadata
● What is “Metadata”?
○ In short: data about your data set
● Might include date produced, units, etc
● You’ll need it later!
Publish Data
● (If you think others might want to use it)
● “Your data is as much a product of your research as the papers you write”
● Figshare, Dryad, Zenodo
Directory Level Management
Directory Names
● Your project should NOT be named “foo” or “a”
● Subdirectories should also be descriptive
○ Documentation in “docs”
○ Source in “src”
○ Scripts in “bin”
○ Etc…
● Should include a “data” and “results” folder
○ Make a distinction between what goes in each folder, as your results will surely contain
data!
○ Idea: every output goes in “results”, every input goes in “data”
Directory Names
❏ README
❏ LICENSE
❏ Tests
    ❏ testSightings.py
❏ data
    ❏ birdcount.csv
❏ doc
    ❏ notebook.md
    ❏ changelog.txt
❏ results
    ❏ summarized-results.csv
❏ src
    ❏ Sightings.py
Subdirectories (Don’t make too many)
❏ src
❏ helpers
❏ datastructs
❏ graph
❏ graphsearch
❏ methods
❏ dfs.py
Don’t Repeat Previous Work
● Use external libraries as much as possible
○ Optimized code and saves development time
● Use Google, GitHub, cppreference, etc
Project Level Management
Version Control
● Discussed Earlier This Semester
● Git, CVS, Mercurial, etc
○ Git preferred (GitHub, Bitbucket)
● Commit often, Commit early
● Don’t add large data dumps/files!
○ Makes version control slow, impractical
○ We will discuss later in semester how to manage this stuff
Adding Features, Refactoring
● Add features incrementally
○ Constantly check correctness
○ Don’t expect to add 1k+ lines and have your code work the first time
● Refactoring is a natural part of coding
○ Don’t avoid it
○ Otherwise you end up with bloated code
To use an IDE or not to use an IDE...
● I’m not sure!
○ What if you like Microsoft Visual Studio, Eclipse, or PyCharm?
○ Problem: code should be accessible to everyone
○ Getting libraries integrated into an IDE can be painful
■ For numeric libraries, even more annoying
■ Some software makes this easier, e.g. Intel Parallel Studio XE, NVIDIA Nsight, etc
○ If you’re prototyping and know the IDE’s debugging and profiling tools well, why not
○ Watch for a mismatch between the IDE environment and the deployment environment
Issue-Tracking Software
● Common Mistake
○ “I need to refactor A, B, C and debug I, J, K”
○ (One seminar and one nap later) “What was I supposed to do again?”
● Many out there (Wikipedia lists ~ 50)
○ Bugzilla, Apache Bloodhound, Planbox, etc
Working with Others
Industry vs Academia
● In industry, a group of experienced engineers is often assigned to manage
a single piece of software
● In academia, a single person might manage multiple pieces of software
Getting a Second Look
● Just as research ideas need a second look, so does a potential code base
● Pair Programming is extremely beneficial
○ Could be a problem if you’re the only one working on a project
● Coding with others ultimately makes you a better programmer
Documentation and Technical Writing
Create Barely Sufficient Documentation
● Somewhat covered last semester
○ Documentation generation via Sphinx, Doxygen, etc
● You are writing the documentation for yourself as well as others!
Document All Work You’ve Done
● Not just the code you plan to release; code you’ve written but not used,
ideas you’ve tried (both successful and unsuccessful), etc
Reports and Papers
● Writing a paper or technical report? Put it under version control as well
● Formal Approach: Treat paper/report writing as programming.
● Saves you time and effort down the road
Figures
● One script per figure
● Don’t manually change parameters; input them into functions
● Automation
○ Don’t be tempted to manually adjust window size and click the “save as” button in
MATLAB
Conclusions
Conclusions: Takeaways
● Following software engineering best practices saves development time and
headaches, and improves user-friendliness
● Developing (and maintaining) software is hard!
Conclusions: Questions
● Why put in all this effort if no one else is going to use my code?
● Considering the time spent improving non-essential parts of my code, will
the time saved from following best practices be greater than the extra
development time invested?
