Showing posts with label Software Engineering. Show all posts

10 April 2017

The Problem with Today’s Software Thought Leaders

Albert Einstein once said: "What is right is not always popular and what is popular is not always right." The current zeitgeist in software development has this exact same feel. Our lives as software developers have been shaped by the ideas of many people. Some of those ideas are right and some are popular. In time the wrong ideas will fade and the right ideas will dominate. The problem is: How do we make that determination in the present? What are the good ideas and what are the bad ideas?

As software developers our working days usually consist of two types of professional activities, one is obviously creating software (software development) or related technical tasks configuration management, troubleshooting, deployments, etc. and the other is being managed or other duties that relate to providing information (status) to managers (software development management). The software industry is very young and constantly changing. At any given moment there are dominant de facto standards that are followed for both software development and its management. Although it varies across industry sectors, many development practices and tooling have become standards, things like IDE’s, continuous integration, the use of tests and testing frameworks, etc. Some of these have been influenced by Agile and XP. On the management side the standard seems to be largely dominated by Agile like scrum and associated practices from XP.

The people who dominate our industry with their ideas could be described as our thought leaders. Wikipedia has the following definition for thought leader:

A thought leader can refer to an individual or firm that is recognized as an authority in a specialized field and whose expertise is sought and often rewarded.

The creation of Agile was done in a ceremonious fashion, depicted in an odd ritualistic looking and now famous photograph, by signatories to a manifesto. Let’s call these people the Agile Thought Leaders or just the Agilistas (since brevity is our thing). The Agilistas have come to have their ideas dominate the industry and in many cases dictate how we now spend our days.

It seems that Agile’s rise in popularity is directly proportional to my increasing skepticism about it. I find myself internalizing that skepticism into frustration and even feelings of anger towards it and the Agilistas. These feelings become exacerbated by the various incarnations of Agile created by legions of certified Agile consultants that roll into my life to tell me how to spend my days.

Now those feelings are not constructive and one really needs to step back and ask: Why is what I see in the industry wrong and how can I form and a constructive criticism of it?

I was lucky to come across Greg Wilsons’s talk: "What We Actually Know About Software Development, and Why We Believe It's True". This is a very enlightening talk which I highly recommend. In it he looks objectively at how we have arrived where we are and how we can possibly better move forward. I got the term Agilistas from his talk, although I don’t know if he coined it.

The talk touches on issues that I feel plague both my career and our industry. The problem manifests itself in how we make our decisions about what we do on projects both from the day to day and to the larger industry practices. Many of these decisions seem to be influenced all too often by people who claim some authority, sometimes it’s positional authority, sometimes it’s due to some form of gravitas and all too often right or wrong it boils down to someone’s opinion. To better illustrate this about our thought leaders let's look at Greg Wilson’s critique of one of Martin Fowler’s publications about DSL’s (Domain Specific Languages). The relevant quotes with time stamps are as follows:

11:26 [Martin Fowler] is a very deep and careful thinker about software and software design and last summer in IEEE Software he had an article on Domain Specific Languages, this notion that build a tiny language in which the solution to you original problem in easy to express. And in that he said using DSLs leads to improved programmer productivity and better communication with domain experts.
11:51 I want to show you what happened here. One of the smartest guys in our industry made two substantive claims of fact in an academic journal and there is not a single citation in the paper. Not a single foot note or reference. There is no data to backup this claim. I am not saying that Martin is wrong in his claims. I am saying is what we have here is a Scottish verdict. Courts in Scotland are allowed to return one of three verdicts, guilty, innocent, or not proven. Arguments in favor of DSLs, arguments in favor of functional programming languages making parallelism easier or Agile development leading to tenfold return on investment or whatever are unproven. It doesn’t mean they are or wrong, it doesn’t mean they are right, it means that we don’t know and we should be humble enough to admit that.
12:38: Carrying on in that same article "Debate still continues about how valuable DSLs are in practice. I believe that debate is hampered because not enough people know how to develop DSLs effectively." Crap I think the debate is hampered by low standards of proof.
12:55: Listening computer scientists argue, listening to software engineers in industry argue, it seems that the accepted standard of proof is I’ve two beers and there’s this anecdote about a tribe in new guinea from one of Scott Berkin’s books that seems to be vaguely applicable therefore I’m right. Well no, sorry that’s not proof and you should have higher standards than that.

I think these are excellent points. I disagree with Greg Wilson here and feel that Martin Fowler is not one of the smartest guys in our industry, although he maybe one of the most influential. I also question whether he can be considered a careful thinker. I think his approach like many of the other Agilistas is self serving and shows very little humility or willingness to be self critical. This is something that Bertrand Meyer commented on when reaching out to the Agilistas in conjunction with his book on Agile! The Good, the Hype and the Ugly that almost all of the Agilistas treated him with "radio silence" after receiving his draft. The problem is that the Agile business is just too lucrative for consulting and on the speaking circuit. A great example of this business model is "Uncle Bob" Martin’s infomercial-esque site promoting his wisdom for sale. I refuse to link to it, so google it, if you want to see that spectacle. All of these guys make a lot of money selling Agile and if they started being critical of it that would be a risk to their revenue streams. Also how many of these guys still do significant software development work? Ironically I think Uncle Bob still codes, but I am talking about opening up an IDE or text editor after checking out a big code base from version control and fixing bugs or doing actual coding or refactoring. Now maybe that’s not fair to ask that, or is it? After all they are driving how we spend our days being managed on software projects.

Another point of contention I have can be found on Martin Fowlers website under the FAQ section:

Why is your job title "Chief Scientist"?
It's a common title in the software world, often given to someone who has a public facing technical role. I enjoy the irony of it - after all I'm chief of nobody and don't do any science.

He openly admits he does no science (the black bolding is mine). Therein lays the problem with all of the Agilistas. There seems to be no scientific method involved in many if not all of their claims. These claims as we have seen are sometimes made in "academic" journals without any citations. Of course I am skeptical about much that is written about software in journals published by ACM or IEEE and of the organizations themselves. I think Greg Wilsons observation validates my skepticism and further draws attention to the lack of scientific method involved in software. We need better science and people who are intellectually honest about the fact that they are just expressing unproven and in some cases unfounded opinions.

It seems to me a more conscientious approach would be for the Agilistas to promote and fund studies. Use the scientific method to back up their claims and help move the industry forward. Whether they like it or not that is what will most likely eventually happen and they will either need to get on board or be left behind.

The irony here is that while we need more of a scientific method in defining software methodologies, much of our infrastructure comes from mathematics, another area that seems to be lacking with the Agilistas. For example: Noam Chomsky’s formalization of grammars were adapted by Jim Backus and further refined algebraically by Donald Knuth to give us modern compiler theory and compilers. There’s a long list of people who have laid down the mathematical foundations, people like Alan Turing, John Von Nueman, Alonzo Church, Stephen Cole Kleene, Haskell Curry and the list goes on. Unfortunately I think too many people in our industry are ignorant of our field’s true mathematical underpinnings. I have also encountered a large degree of anti-intellectualism from programmers when it comes to applying modern math to programming. Again I think this is an area where one should only ignore it at their own peril.

If you want to see the work that should and probably will drive our future I would recommend looking at what the researchers are doing, people like:

These are just a few. There are many people doing research that will eventually find its way into our industry. It has been happening this way all along. The problem with these people’s work is it requires real intellectual effort as opposed to the Agile approach which, while in some cases maybe common sense, all to often seems to come across as a bunch of platitudinous management speak.

I think we need to hold the Agilistas to higher standards of proof!

Disclaimer: I still code personal projects but have not been on a dev team doing real dev work in about 3-4 years after doing it for over 25 years. Actually from what I have seen of Agile management techniques and war room/open office layouts, I am in no hurry to do any of that any time soon.

References and Further Reading

26 September 2016

Software Development is a Risky Business

Software projects are inherently risky. Some statistics show that roughly seventy percent to fifty percent of software projects get classified as “challenged”. Where challenged can mean cost overruns and/or delivery delays, or it can mean complete failure and the cancellation of the project. Some statistics indicate as the size of the project increases so does the probability of problems and failure. Many articles and blog posts have been written about the long history of software project failure. At the time of writing this post the most recent and very public software project debacle was the healthcare.gov project. It seems as IT becomes more pervasive and essential the problems with projects may be getting worse not better.

In spite of these numbers and all too often in my experience I have found that individuals and organizations seem to have a very optimistic outlook about their own projects especially at the beginning of the project. Perhaps it’s the classic: “that won’t happen to me” attitude. Even as projects start to run into problems I have seen people especially managers not take issues seriously or ignore them all together. In some of the analysis of healthcare.gov, this was cited as one of the problems.

Risk management and risk assessment/analysis are developed areas in project management and engineering and are often discussed in software projects. Many risks will end up falling into two categories. One category is actual risks to the end product, such as unforeseen security vulnerabilities in software systems or other engineering issues that lead to flaws or critical failures of the system or components. The other is risks in the management and execution of constructing product or system.

Risk management is one of those areas where the engineering aspects and the engineering management aspects can become tightly coupled. An example can be derived from the XKCD comic “Tasks”: https://xkcd.com/1425/. Now the point of this comic is to demonstrate the difference between an easy problem and hard one, and presumably the conversation takes place between a manager and developer. In a risk setting, perhaps solving the hard problem is the goal of the project. So a risk might be that the research team fails to solve that problem at the end of the five years, or that it takes ten. So in a sense the task is high risk but there are team related risks, maybe the budget is too low and the team is subpar due to low salaries so they can never even solve it. Another possible scenario is that the conversation never happened and the project management and team think it is solvable in six months. This second scenario is the risk I have seen though out most of my career, the project management and team are overly optimistic or really don’t have a good understanding of the complexity of the task at hand. In some cases these types of issues seem to be due in part to pluralistic ignorance. Often in cases where developers don’t want to contradict management. There’s a great dramatization of this type of behavior in the show Silicon Valley when the Hooli developers are working on the Nucleus project, where the information is conveyed up the management chain but nobody wants to tell Gavin the truth.

Sadly it seems like the risks that are identified are almost peripheral to the real risks on projects. It is usually things like the hardware delivery will be delayed which will delay setting up our dev/test/prod environments and delay the start of development. While these are legitimate risks, there is always the risk of all of the unknowns that will cause delays and cost over runs, even things like developer turnover never seem to be taken into account, at least in my experience. Also there seems to no cognizance of the long term cost risks like the project was poorly managed or rushed or was done on the cheap which caused the delivered codebase to be an unmaintainable big ball of mud that will incur a much higher long term costs.

In order for software engineering to grow up and become a real engineering discipline it will need formal risk assessment and risk management methodologies both on the project management side and on the work product side. This is also an area where ideas and methodologies can potentially be taken from existing work in other engineering disciplines. These methodologies will also most likely draw from probability and game theory.

References and Further Reading

23 March 2015

What is Software Correctness?

Software correctness which is really software quality is not one thing. It is comprised of a number of different and sometimes conflicting attributes. As always Gerald J. Sussman provides interesting insights and in this case it is in his "We really don’t know how to compute" talk. He makes the point that "correctness" may not be the most important attribute of software, the example he give is if Google’s page craps out on you, you will probably try again and it will probably work. Actually this example might be more about reliability but it got me thinking as I have always thought that "software correctness" meant the combination of both accuracy and reliability and that was this the "correctness" that I always thought was the most important feature of software. But this correctness is really two software quality attributes and the Google search software exemplifies these two aspects of "software correctness", one is availability and the other is search accuracy or the fulfillment of the users expectations. One could argue that availability is not a software attribute but it is affected by the quality of the software and it is an attribute of a software system even if some of that might be driven by hardware. So these two attributes for Google search do not have to be perfect. The search quality needs to be at least better than the competition and the availability has to be high enough so that people won’t be discouraged from using it and for these two attributes they are both very high. These parameters of quality vary in other domains, a machine administering a dose of radiation to a cancer patient would need to have stringent requirements as to the correctness of the dose, and in fact there have been deaths due to exactly this type of problem. Another is software guiding a spacecraft or a Mars rover which needs to be extremely reliable and correct and when it hasn’t been, well there are a few space bricks floating around the solar system due to software errors. In the real world, especially now, software is everywhere and it now lives in a variety of contexts. So which quality attributes are important will vary based on its domain and its context.

So what makes software good? Google first conquered search with Map-Reduce and other processes running on custom commodity hardware that allowed for flexible fault tolerant distributed massive data crunching. Amazon has commoditized its server infrastructure to sell spare capacity that can be easily used by anyone. Apple created easy to use attractive interfaces for the iPhone and iPad that consumers loved with similar usability now being offered on Android. NASA has successfully landed on the frigid world of Titan, touched the edge of the heliosphere, and has run rovers on Mars. While there are obviously hardware components to these systems, it was the various types of software quality that played a pivotal role in these and the many other great software successes.

It is easy to look back at these successes. With the exception of NASA, I doubt there was a deliberate effort on which quality aspects to focus on. Most were probably chosen intuitively based on business needs. Software quality is essential to any organization that creates software. It is an established subdomain or perhaps a related separate sibling field of software development. It has consortiums, professional organizations and many books written on the subject. So I delved into some of these and you find mention of Agile and TDD. You also start to see things like CMMI, ISO 9000, Six Sigma. Some of the resources seem to be written in some alien language, perhaps called ISO900ese, that seems related to software development but leaves me feeling very disconnected from it. I think that the problem might be that these attempts to formalize software quality as a discipline are suffering from the same problem that plagues the attempt to formalize software engineering as a discipline. It conflates the development process with the final product and the software quality formalization might be further exacerbated by the influence of the domain of manufacturing and its jargon. I think these established methodologies have value, for example makes sense, but it seems like in general with these standards there is a lot to weed through and the good parts seem to be hard to pick out. So I thought it might be fruitful to look at the problem from another angle that might provide a better way to reason about the quality of a system not only after and during its development but also before development begins.

A list of "quality parameters" for software is a requirement for software engineering to be a real engineering discipline. These parameters could then be viewed qualitatively and quantitatively and perhaps even be measured. In engineering disciplines various parameters are tradeoffs, you can make something sturdier but you will probably raise the cost and or the weight, for example plastic bumpers on cars, metal would be more durable but would be heavier and more costly. If one were to think of some parameters in software as tradeoffs you can see that improvements on some parameters may cause others to go down. In my post "The Software Reuse Complexity Curve" I posit that reuse and complexity are interrelated and that the level of reuse can affect understandability and maintainability. Another example is performance, increasing performance of certain modules or functions might lead to an increase in complexity which without proper documentation could lead to harder understandability which might cause a higher long term maintenance cost.

So to come back to the question and to rephrase it: What are the attributes that constitute software quality? I am not the only person to ask this question as I have seen other blog posts that address this. There are quite a few people who have asked this question and have compiled good lists. One post by Peteris Krumins lists out some code quality attributes that Charles E. Leiserson, in his first lecture of MIT's Introduction to Algorithms, mentions as being more important than performance. Additionally Bryan Soliman’s post "Software Engineering Quality Aspects" provides a nice list with a number of references. I thought I’d create my own list derived from those sources and my own experience. I am categorizing mine in three groups, Requirements based (including nonfunctional requirements), structural, and items that fall into both categories.

REQUIREMENTS

Performance
Correctness (Requirements Conformance)
User Expectations
Functionality
Security
Reliability
Scalability
Availability
Financial Cost
Auditability

STRUCTURAL

Modularity
Component Replaceability
Recoverability
Good Descriptive Consistent Naming
Maintainability/ Changeability
Programmer's time
Extensibility/Modifiability
Reusability
Testability
Industry standard best practices (idiomatic code)
Effective library use, leveraging open source (minimal reinvention)
Proper Abstractions and Clear code flow
Good and easy to follow architecture/ Conceptual Integrity
Has no security vulnerabilities
Short low complexity routines

BOTH

Robustness/Recoverability
Usability
Adaptability
Efficiency
Interoperability/Integrate-ability
Standards
Durability
Complexity
Installability

This is by no means intended as a definitive list. The idea here is to put forth the idea of enumerating and classifying these attributes as a way to reason about, plan and remediate the quality of software projects, basically create a methodical way to address software quality throughout the software development lifecycle. In the case of planning software quality attributes can be potentially rated with respect to priority. This could also allow conflicting attributes to be more effectively evaluated.

The attribute of availability is one that is fairly easy to quantify and often gets addressed usually in some type of Service Level Aggreement (SLA). Although availability cannot be predicted, the underlying system that "provides" it can be tested. Netflix has developed chaos monkey to in part to verify (test) this quality attribute of their production system. Tests and QA testers are a way to verify the requirements correctness and the correctness of system level components and services contracts of software. Static analysis tools like PMD, Checkstyle, and Sonarqube can help with code complexity, code conventions and even naming.

Complexity is an elusive idea in this context it also crosses many of the above ideas. Code can be overly complex, as can UIs, Usability (think command line), Installability, etc. Another cross concept is understandability which can affect code, UIs, etc. which is in part related to complexity but can manifest itself in the use of bad or unconventional nomenclature as well. There may be other cross concerns and it seems that the spectrum of quality attributes might better reasoned about by separating out these concepts and developing general ways to reason about them.

I should note that I include financial cost as one of the attributes. While it might not be a real quality attribute, higher quality may necessitate higher cost, but not necessarily. If one were to develop two systems that had the same level of quality across all of the chosen attributes, but one system had twice the cost, the cheaper one would be a better value, hence more desirable. So cost is not a quality attribute per se, but it is always an important factor in software development.

In summary my vision is a comprehensive methodology where the quality attributes of a software system are defined and prioritized and hopefully measured. They are planned for and tracked throughout the life cycle and perhaps adjusted as new needs arise or are discovered. The existing tools would be used in a deliberate manner and perhaps as thinking about the quality attributes becomes more formalized new tools will be developed to cover areas that are not currently covered. It will no longer be about code coverage but quality coverage.

References and Further Reading

Anthony Ferrara, Beyond Clean Code, http://blog.ircmaxell.com/2013/11/beyond-clean-code.html

David Chappell, The Three Aspects of Software Quality: Functional, Structural, and Process, http://www.davidchappell.com/writing/white_papers/The_Three_Aspects_of_Software_Quality_v1.0-Chappell.pdf

Rikard Edgren, Henrik Emilsson and Martin Jansson , Software Quality Characteristics,

http://thetesteye.com/posters/TheTestEye_SoftwareQualityCharacteristics.pdf

Bryan Soliman, Software Engineering QualityAspects, http://bryansoliman.wordpress.com/2011/04/14/40/

Ben Moseley and Peter Marks , Out of the Tar Pit, http://shaffner.us/cs/papers/tarpit.pdf

Brian Foote and Joseph Yoder, Big Ball of Mud, http://laputan.org/mud/

Wikipedia , Software quality, http://en.wikipedia.org/wiki/Software_quality

Franco Martinig, Software Quality Attributes, http://blog.martinig.ch/quotes/software-quality-attributes/

MSDN, Chapter 16: Quality Attributes, http://msdn.microsoft.com/en-us/library/ee658094.aspx

Advoss, Software Quality Attributes, http://www.advoss.com/software-quality-attributes.html

Eric Jacobson, CRUSSPIC STMPL Reborn, http://www.testthisblog.com/2011/08/crusspic-stmpl-reborn.html

15 March 2014

Are Your Developers Treating Your Project as Their Own Personal Science Project?

Years ago I worked on a project that had a custom role based security system and for this system we needed a way to load menus according to the users’ assigned roles. Two of the dominant developers on the project decided that this was a perfect use for AJAX mainly because they wanted to learn and use AJAX. So they build a separate deployable menu application to serve up the menu by emitting XML from a Spring Controller URL, it was very comparable to a simple Restful service. It was called directly from JavaScript in the application’s web page. Since there would be a lot of calls for any given user’s menu they built a cache for each user’s menu options, it was a custom cache where objects never expired. This work was completed as I joined the project and as I got acquainted with this project my first thought was why do all this work? Why not just load the menu when the user is authenticated and then cache it into the HTTP Session, I raised the issue but was ignored. It was their pet project and it was sacrosanct.

Later, while testing the apps that used the AJAX menu call the testers ran into a problem, the cache was interfering with testing because it was keeping the database changes from be propagated and testing the applications created a need to continuously restart the AJAX menu app. So a new admin interface was added to the AJAX menu application to allow the testers to clear the cache. This created a whole separate application that the client would have to support just to enable this unnecessary AJAX menu delivery mechanism. So instead of just caching the data in the HTTP Session which is easier and any cache problem would be fixed by just logging out and logging back in, these two developers built a pet project which consists of an extra application and codebase that has to be understood and maintained. Ultimately the biggest problem this creates boils down to simple economics, it more expensive to maintain a solution that requires a whole separate application.

I think there is a tendency for all developers both good and bad to sometimes want to make projects more interesting and there can be a lot of temptations. It is often tempting to create a custom implementation of something that could be found already written because it’s a fun project or the temptation to shoehorn a new hot technology into a project or even the temptation to add a solution to a problem that doesn’t need to be solved or just strait up over engineering. I am guilty of some of these transgressions myself over the years.

My anecdote about the menu app is just one of at least a dozen of egregious examples of these types of software project blunders that I have seen over the years. My anecdote shows a manifestation of this problem and it also shows that these types of software project issues have a real tangible cost. They create more code and often more complex code. In some cases they create the need for specialized technical knowledge that wasn’t even necessary for the project. While these approaches may only result in a small cost increases during the construction of the software the real consequence will be the long term maintenance costs of a software system that was needlessly overcomplicated. In some cases these costs will be staffing costs but they can also be time costs in that changes and maintenance take considerably longer. These problems may also result in costs to business as unwieldy systems cause the loss of customers to competitors and can impede the ability to gain new customers.

These types of decisions can be viewed as software project blunders. In some cases the developers know better but choose to be selfish in that they put their own desires above the project, in these cases these decisions can potentially be viewed as unethical. The two developers from the anecdote were extreme examples and were pushing into unethical territory since they treated every new project as a personal opportunity to use some new technique or library or some custom idea that they were interested in. However, I think most of these blunders are simply that, mistakes in judgment or lack of good oversight. It is always easier in hindsight to see these problems, but when one is in the heat of the moment on the project one can lose perspective. The real challenge is trying to avoid these blunders. Really these are problems that fall into a larger spectrum of software project management problems and sadly I have mostly seen terrible software project management throughout my career.

I wish I could say I have a solution to this problem. I think developers have some responsibility here especially senior developers, well at least the good ones. Software development is really about decisions. Developers make many decisions every day including naming, project structure, tools, libraries, frameworks, languages, etc. Too often these decisions are made in isolation without any review and sometimes they can have serious long term consequences. When making so many decisions in such a complex domain mistakes are inevitable. Good software developers and good teams make an effort to deliberate and review their choices.

07 August 2012

Why You Can’t Have a Real Software Engineering Discipline

As we all know the fields of computer science and software engineering are in their infancies. Many a blog post has been written lamenting the fact that software engineering is not a real engineering discipline, and while I have not written that exact post, I have written about the subject and have deconstructed what others have said about it. A number of these points do apply to why software projects routinely fail, yet another topic that has received considerable attention. Now admittedly all engineering disciplines regardless of their maturity and formalization have project failures and creating a real software engineering discipline will not eliminate this problem but one would hope that it would abate it.

I find it odd that at a time when so many people are involved in IT there seems to be little discernible progress in creating a real software engineering disciple. There are probably millions of people working on creating software. Also there are many researchers trying to solve this problem from many different angles. Practitioners have been attempting to solve it with ideas like Agile and Software Craftsmanship, etc. Additionally there is a long list of failures surrounding the formalization software engineering.

My biggest complaint is the fact that there really are no good formal standards in general software engineering principles and methodologies lack a formal foundation and even the more vague principles can be easily thwarted and misused, and often, as I have previously complained, it often ends up being based on pure opinion which is usually won through positional authority, perseverance or by those who are just more politically savvy.

The intent of this post is not to attempt to solve or even offer solutions to the problem of creating a real software engineering discipline but to look at what I think are some barriers to creating this discipline and in some of these cases offer thoughts on overcoming those barriers.

Low Entry Requirements

If we are to think software engineering as an engineering discipline, I challenge you to find another engineering discipline that routinely has practitioners with no formal training in the field. I very much doubt that someone who did not hold a degree in engineering and who built a shed in his back yard is now working as a structural engineer on a construction project, the mere idea seems ludicrous. Yet I have worked with non technical degree holders who became software developers, one who started by building web pages and now calls himself an "Internet Application Architect", and is probably one of the biggest cargo cult coders I’ve ever had the misfortune to work with. This egalitarian aspect isn’t all bad as it does let good people into the field as well. Another non technical degree holder I once worked with became an accomplished developer and went on to get an advanced degree in CS and is now a CS professor. Although it’s probably the case that for every good non technical degree holder who joins the field it is likely that dozens of mediocre and bad practitioners will also join. It seems that in other engineering disciplines there are much more stringent educational requirements to ensure proper training. To be clear here I am not equating having a degree with being competent because I have met many incompetent people with degrees, but still you need something. I confess I am at a loss for a solution for this one.

The Lack of Differentiation between Science and Engineering

Chemistry is a scientific discipline, chemical engineering is an engineering discipline, and you can make this comparison between other engineering disciplines and the sciences that they employ e.g. electrical, mechanical, and structural map to various areas of physics. Engineering and scientific disciplines are different types of disciplines often taught in separate schools. Although each engineering discipline has some scientific overlap and it would probably be possible to move from the appropriate scientific field to the corresponding engineering field in general you probably would not do so without returning to school. In recent years software engineering curricula have been added to the rosters of many colleges and while this is potentially a step forward, discounting for the moment that software engineering is still not really real engineering, a software engineering degree and a computer science degree in many cases gets you the same job!

So in other fields there would be a differentiation between a degree that would put you on a track to do scientific research and one that would put you on a track to do engineering work. As I understand it, if you are fresh out of school with a BS in CS that qualifies you to be a tester at Microsoft¹, developers hired right out of school need at least a Masters degree in CS. This is an extreme case of how this really breaks down in our industry. In most cases engineering practitioners are the software engineering, MIS, CS or even non technical degrees and the few scientific careers generally go to the advanced degree holders. I admit this probably fairly normal a BS in chemistry or biology is also more likely to get you a job in IT than a job as a research scientist. Still compared to more mature engineering fields there seems to be a lack of real differentiation between the degrees that yield a career as a software engineer versus one as a computer scientist.

Bad Management

The entry requirements for managers in software make the low entry requirements for software engineering practitioners look downright rigorous. I have previously criticized the fact that many organizations find the cheapest people, especially for software management and give them an obligatory certification. Now I know that engineering management regardless how formal or established the engineering discipline is an area that has problems and failures, but again I very much doubt that you would find a twenty something business or liberal arts major with a freshly minted PMP suffix managing construction or civil works projects. Yet in my experience this is pretty normal in IT especially in the government sector. Of course bad software management is executed by older managers as well.

A post by Larry White titled "Engineering Management Is Dying" delves into some these issues. It is definitely the case that methodologies like Agile have changed how software projects work and the traditional corporate management approach to software really doesn’t work. One point he makes is that it is not uncommon to see a two hundred person project at Google headed by an engineer. To me this implies that someone is in some way managing that project. I think all of this is indicative of the need for software engineering mangers to also be practitioners in engineering or at least very well versed in how software development works. The situation he talks about at Google is not the case everywhere, in that case the company is its own client, but there are many cases where software is being built for clients and this does create a need for management that deals with the client, perhaps this should be a role that is separated from management and called a client liaison, which is what often happens. Also I am not sure how things work at Google but I have also seen internal development in organizations where the IT department provides services to an internal client, I have also seen this go horribly awry with contentious unproductive relationships between departments. In short I feel that just as we need a real software engineering discipline we also need a real incarnation of software engineering management.

The Disconnect between Academia and Industry

Most practitioners do not keep up with, or read academic research, actually many practitioners don’t read at all, but that’s another issue, and most academicians don’t work in the field so they lack the hands on knowledge. Unfortunately this rift can take on a somewhat disdainful tone as is the case in my exploration of one practitioner’s attempt to define "Real Engineering". Fortunately some people, I’d like to count myself among them, take a more constructive approach to building bridges across this rift. Daniel Lemire has an interesting critique the quality of software produced in academia.

I think the solution here is for both sides to become more engaged in problems faced on each side. I am optimistic about this one as I feel that these walls are breaking down as the field is growing up and many of the emerging technologies are forcing developers to be more cognizant of and engaged in research topics. I also have encountered much more research that has ties to the practical concerns of day to day software development, some I have mentioned with more to come.

The Lack of an Effective CS Math Curriculum

If you read my blog you know it to be a mix of math and software practices. I would describe myself as a software practitioner and a math enthusiast. My math journey has taken me on some interesting math excursions into areas of math that seem to get little or no mention in CS curricula and I feel that this is a major problem in really applying math to the field of software engineering. Another problem is in the way that it is taught, my experience was that it was not only taught badly but in a way that made it seem irrelevant to many of the programming courses. Specifically I would shift the CS math curriculum to include less differential and integral calculus and include more logic, combinatorics, abstract algebra, graph theory, order theory, category theory and probably some topology among others.

Over the last few years I have been progressively learning more math which has been in part motivated by necessity for understanding research papers. This approach has changed the way I see things, I now see the patterns of math in software. They are there and I believe that they can be exploited to create a real software engineering discipline.

For the math learning problem I do have some ideas many of which are expressed in my blog. Some of my posts like refactoring if statements with Demorgan’s Laws or the String Monoid, the Object Graph, etc. are about making these mathematical ideas more relevant and accessible to programmers. Other posts like my series on naming and my post on generic programming and entropy and complexity are about my own ideas and ideas in the research literature that I feel may help to create a real software engineering discipline. These are my attempts at solutions to these problems, so expect more of both of these and more thoughts on the CS math curriculum.

The Software Architect Debacle

This is one that really bothers me. I feel the software architect role in general has a very negative effect on creating quality software and it dissuades the industry from developing a real engineering discipline. The Software Architect role tends to be broadly defined, you have enterprise architects, software architects, etc., some architects tend to be involved with hardware and networking, some work on configuration management, some do software design, some do all of these things and more. I feel this is a problem as each of these is a separate engineering area with different types of problems. By not breaking these down it often leads to a lack of focus on specific problems, also I have seen cases where the architect focuses on their preference and ignores other issues. The architect role is often divorced from the code. I have met many architects that were responsible for software but had never even looked at the code. To me this is a huge failing that architects that are responsible for building software often lack the interest, time, or even aptitude to know the quality or underlying of structure of the software that they are delivering.

To really do justice to these ideas I might need a separate post, my solution would be to break down the architect role into specific engineering roles, which could include:

Software Process Engineering - this would be involved with software team and resources planning, some aspects of configuration management such as defining policies, requirements analysis and general project management aspects of software construction including the definition and refinement of the SDLC. This role is tightly coupled with software engineering management.
Software Structural Engineering - this would include requirements comprehension, code infrastructure planning, prototyping work, hands on code structural work including custom frameworks and reusable components, third party library products, and general code quality including reviews and static analysis.
Software Quality Engineering - This would include the QA roles, software testing and testing tools, requirements validation and refinement, usability and reliability and general software quality issues.
Software Infrastructure Engineering - This would include hardware, networking, database, app server and general service infrastructure, non functional requirements fulfillment and possibly the implementation of configuration management, continuous integration, etc.

This is a rough "sketch" of these possible roles and each of these areas has some overlap implying that all of the people in these roles would work closely together, also some combination of, or all of these roles might be performed by a single individual depending on the organization’s size and structure. What this approach does is clearly defines roles and responsibilities as opposed to some nebulous software architect role.

The Continuous Turnover of the Work Force

A young CEO once commented that he thinks young programmers are superior. In general companies prefer younger workers as they often don’t have family commitments so they can work more hours and you can pay them lower salaries, DC area government contractors rely on this to keep their profit margins up.

As an older developer I often find this frustrating. I confess that I have not moved up the ladder and still mostly work as a developer or what might be called a hands on architect, yes I know but that’s the current term, actually my preferred title would be: Software Structural Engineer. Working as a contractor I have on several occasions found myself on projects that were dominated by younger developers and unfortunately on more than one occasion I watched as teams would make many mistakes due to a lack of experience, in some cases I was able to help in others I was ignored by the younger developers. This is not to say that I do not still make mistakes, I’ve just been around long enough to have made a lot of them already.

If you buy into the software craftsmanship thing, I admit to being skeptical of this idea, you might be inclined to draw a parallel to craftsman of the past when older established masters took on apprentices and passed on their wisdom to the next generation. Given the access to information these days such process is somewhat antiquated and perhaps impractical. Nevertheless I feel, and the industry supports me here, that older developers who keep their skills up to date have value. I know several younger developers who have sought me out to learn from me so I think I can safely say I still have some value.

Many older developers have let their skills get rusty and have not kept up on current technologies I have seen this many times. Also many companies seem to lack the vision to allow for real hands on technical career growth. I can’t help but feel that this inclination to recycle developers in each new generation is hurting or at least slowing our ability to develop a real software engineering discipline, this potential loss of continuity seems like it leads to some lost wisdom.

The Social Networking Drain and Hollywoodification of Silicon Valley

As I write this Hollywood gears up to deliver a reality TV show that takes place in silicon valley and Mark Zuckerberg now gets the paparazzi treatment. Another article that I previously mentioned was about how CS enrollment was up because the movie The Social Network had enticed many aspiring Zuckerberg wannabes. At present there does seem to be a gold rush mentality and it is noticeable and even discussed on sites like Hacker News.

As someone who has never worked in the valley I am very much an outsider and may not have a good perspective on these things, but it seems that there are a lot of startup companies focusing on crap. Now I get it. We live in a shallow consumer society with a rapacious appetite for crap. But are not these supposed to be some of the smartest people and I am not the only one to wonder why so many smart people who should have better taste and higher standards are settling for working on serving up ads and other vapid crap instead of doing more meaningful work like working on real problems that would advance human society or even just work on advancing software engineering.

¹I believe this was the case in the late 90’s and may have changed, this information was conveyed to me by a former Microsoft employee.

17 July 2012

A Framework for Software System Naming

On the Importance of Naming in Software Systems part IV

In my previous posts on naming I have laid the foundation for a framework for naming in software systems. This post is mostly a rehashing of those previous ideas and is intended to both codify and consolidate the definitions of terms upon which I have chosen and created to build a naming framework. It also serves as a single location to concisely reference these ideas. To gain more insight into the background and development of these ideas see my previous three posts: "A Survey of the Conventional Wisdom on Software Naming", "More Thoughts on Formal Approaches to Naming in Software", and "Software System Morphology". The definitions are as follows:

Structural

Name – The Signifier that consists of a string of characters that refers to a System Referent.
Name Unit Separator – is a mechanism to separate name units within a Name, this can be an explicit character such as "-" or "_". It can also be in implicit mechanism such as the use of Camel Case within the name.
Name Unit – Is the smallest lexical unit of a Name usually separated by a Separator (Name Unit Separator) and is used to describe the lexical structure of the Name.
Named Scope Context – this is the location of the Name within the system, e.g. where the name usage occurs.
System Referent – This is the actual instance of a thing that is named and is used in the system. See System Referent Type for specific types of System Referents.
System Referent Type - This is the specific type or class, as in classification, of what a System Referent is in a system some examples include:
- Classes
- Variables (local, instance, static)
- Methods
- Method Parameters
- Packages
- Database Tables, Columns, Triggers, Stored Procedures, etc.
- HTML Files, CSS Files, Javascript Files, Config Files, all files
- Directories
- Urls
- Documents
- XML Elements and Attributes
- CSS Classes

Semantic

System Morpheme – This is a conceptual unit of meaning in a system that is used in a name, it is distinguished from a Name Unit in that a morpheme can consist of one or more Name Units.
System Natural Language Set – This is the set of Natural Languages used to create the names. It is often English but it need not be and a system could include components or be constructed in a way that might include hybridization between multiple natural languages.
Problem Domain - This is the Semantic Domain that describes problem area for which the system solution is targeted. This would be a set of concepts and System Morphemes that apply to the problem domain. For example a financial system would be built using the relevant financial terms and language concepts.
Solution Domain – This is the Semantic Domain that describes concepts that are used to implement the system including System Morphemes that apply to the solution domain. This might include things like terms that describe Design Patterns, data structures, etc.

As you can see from above these Ideas can be separated into two categories: Structural and Semantic. However the two categories are not completely independent of each other so there is some overlap and interdependence. The structural ideas include: Name Unit, Named Scope Context, System Referent, and a System Referent Type. Each Name has these properties as attributes. A name has a lexical structure made up of Name Units, and refers to a System Referent of some System Referent Type and has one or more locations given by its Named Scope Context(s).

The semantics of the names which has the "unit" described by the System Morpheme idea has semantic properties which will fall into either the Problem Domain or the Solution Domain. The expression of these concepts is dependent on the System Natural Language Set, in other words the natural languages that are used to construct meaningful names for concepts in both the Problem Domains and the Solution Domains.

This is the current state of my thinking on this. These ideas will be refined and might change over time, the objective is to develop something that is useful and that both supports and can be validated by a more analytic approach.

13 May 2012

Software System Morphology

On the Importance of Naming in Software Systems, part III

In my previous posts on naming "A Survey of the Conventional Wisdom on Software Naming" and "More Thoughts on Formal Approaches to Naming in Software" I discussed several ideas about defining a way to talk about software system naming, I feel the name "Software System Morphology" captures this "big picture" concept. My objective is to develop a comprehensive way to define and analyze the semantic nature (morphology) of the construction of a software system. It’s an ambitious undertaking and whether or not I succeed in this endeavor I am hoping that these ideas will contribute to a better solution to this problem.

To review, I am defining three attributes that a name of signifier can have, the name scope context, i.e. where the name is used or occurs, the type or class of the system referent signified by the name, and the lexical structure of the name i.e. the n-gram of name units. I feel that these three qualities can be used to determine and possibly measure aspects of the morphological structure of software and by determining the morphological structure of a software system that structure might be "comprehended" and refined which could yield higher quality software.

In Linguistics a morpheme is the smallest unit of meaning and it is considered, by some, to be a complete sub-discipline of the study of Linguistics. I admit that I am in the process of learning more about it, I am a few chapters into What is Morphology by Mark Aronoff and Kirsten Fudeman it is an excellent book for linguistic neophytes like me. Software System Morphology will both be divergent and dependent on natural language morphological concepts, which are quite complex. However a general understanding of linguistic morphology will make these ideas more accessible, I suggest looking into it on your own, it’s fascinating and worth it solely on its own merits. A quick example of English Morphology can be given with the morphologically complex word autobiography which consists of the morphemes: auto/bio/graph/y, where [auto] means "self", [bio] means "life", [graph] means "drawn" and [y] means "characterized by".

In my first post of this series I explored various ideas proposed by a few prominent software engineers, one idea that I discussed was the use of what were denoted as qualifiers which included the scope qualifier, the type qualifier and the computed-value qualifier. These qualifiers which are prefixed or suffixed to names can be thought of more generally as morphemes. So we can now incorporate those ideas into a more general way of looking at meaning in software naming. Now the natural conclusion, if you have been paying attention, may be to assume that the name unit concept will map to the morpheme concept, it does not, at least in terms of only software morphology and this is in interesting distinction, software morphemes may consist of n-grams of name units which can be natural language morphemes two examples are CommandPattern and MedicalRecord where separating these into individual name units potentially become meaningless in certain software contexts.

Software System Morphology is primarily influenced by three things, the semantic nature of the problem domain(s), the semantic nature of the solution domain(s), which include languages ORMs, paradigms, design patterns, data structures, and finally the natural language(s) in which both the problem domain and solution domains are expressed. The previous example items CommandPattern and MedicalRecord illustrate these ideas as well.

Ok now let’s take our concepts further with a more detailed example. Assume we have a Hospital Medical system which has the following problem domain concepts: [Medical Record, Patient, Visit, Doctor, Hospital, Address, Invoice]. Obviously there are more. For the solution domain we will assume a JEE/Spring style implementation which might include the following concepts: [Entity, Dao, ServiceLayer, Controller, Map, View]. Now some expected names (signifiers) might be: MedicalRecordEntity, PatientEntity, PatientServiceLayer, PatientDao, PatientController, PatientView, VisitEntity, VisitDao, VisitServiceLayer, VisitController, VisitView, PatientVisitMap, PatientHospitalVisitor,... Now as you can see the names are composed of morphemes from both the problem domain and the solution domain. This is to be expected as each name tells what it is/does in the problem domain and what it is/does in the solution domain. In fact any developer with knowledge of JEE Spring Web application developer can look at these names and know what to expect and where they fit in the system. Additionally it is easy to look at the above set of domain concepts and anticipate named system referents that I omitted.

Previously I pointed out that the idea referred to as a qualifier is really a morpheme and while this is true the term qualifier seems to denote a more contextual setting in terms of the domain concepts. In the above examples the solution domain morphemes act as qualifiers telling where and how each named system referent fits into the solution Entity tells that it is a domain object, Dao tells that it relates to persistence, ServiceLayer tells that one would expect to see a separation of concerns that would put the business logic here. Structurally one would expect the domain objects (with the Entity suffix) to be passed back from the Dao’s and ServiceLayer’s and there might be transformative manipulation or aggregation of the domain objects in the ServiceLayer. The idea of qualification is meaning based but in this case it also indicates structural aspects of the solution domain. In this example I intentionally use the Entity suffix, in many systems the domain concepts would be used alone i.e. [Patient, Visit], I have mixed feelings about this in one sense it can be viewed as potentially unnecessary, in another sense it represents a qualifier that can possibly improve comprehension. Qualifiers do make names longer and while I tend to favor longer names it can become problematic if they get too long. These ideas towards naming are powerful for building comprehensible maintainable code as you can see from the above name examples that they follow a consistent pattern, admittedly these are fairly easy naming tasks, but you would be surprised how many developers fail to even do these types of things consistently.

Unlike others who advocate specific practices in regards to naming I feel that naming should be dealt with in a flexible framework that can be adapted to a specific project’s needs. While I personally agree and disagree with ideas previously discussed, my philosophy is that naming should be done consistently and comprehensively reflecting deliberate choices using conceptual understanding and analysis of the Domains. Also it should be tracked in a Lexicon, an idea that I am working on and hope to explore in a later post. Also as I have said before I feel that tooling could be used to allow better conceptual understanding and navigation and the use of analytics and possibly visualization could be used to ease naming decisions in practice by providing improvements to both static analysis and interactive autosuggest/intellisense aids.

I feel that these ideas are an attempt to create a more "formalized" approach to what Eric Evan’s does in Domain Driven Design which also covers a number of other structural ideas, I hope to extend my ideas here to better integrate some of those as well. The next steps for me are to undertake an analytic approach I also hope to explore the application of ideas like formal concept analysis and spectral graph theory techniques which may add some mathematical rigor to these ideas and perhaps allow some empirical validation and probably refinement. Now it gets really hard so it might be a while for those follow-ups.

29 April 2012

What is Generic Programming?

Over the course of my career I have strived to write reusable code, always looking for more general patterns and trying to refactor code to more flexible more generic versions sometimes with mixed success. Lately it has become something of an obsession of mine in that I both try to employ it and have been endeavoring to learn more about it, what it really is, and how to do it right, as I have previously pondered I believe there is a complexity cost with increased reuse. Over the last few years, which I have been programming in Java, generics has become a central tool in achieving reuse. So such a concept comes to mind but it is just one idea in a larger framework of ideas on generic programming.

There is a fair amount of research and discussion of the ideas pertaining to generic programming, the title of this post is taken directly from a paper on the topic, which perhaps not so coincidently was presented at the same OOPSLA (2005) as the paper on Software Reuse and Entropy that I previously discussed. It just seems like every idea I explore is super-interrelated to everything else and this post will be no exception, I can conceptually connect these two papers using a Kevin Bacon like approach using just two other research papers, ideas I have for other posts. I will try to keep this one as focused as possible. Also the referenced paper is, like the Software Framework paper, very math heavy but different math, Category Theory,Order Theory, Universal Algebra, etc. This post will focus on more high level concepts and not the details of math that I wish I could claim that I fully understand.

So back to our question, what is generic programming? For me it is about reuse, and while the information entropy approach to understanding reuse was about reuse from perspective of measure and the reuse (measurable) variation on different domains, I feel that the generic programming concept is about the structure of reuse and how it is implemented in various programming paradigms but the concept itself transcends language specific interpretations and implementations.

As you can tell I am not sure how to exactly answer this question so I will defer to the experts.

"What is generic programming?" in part outlines the paradigmatic distinctions between mainstream OO languages (Java Generics, C++ with STL) and Functional languages (Haskel, Scheme):

As for most programming paradigms, there are several definitions of generic programming in use. In the simplest view generic programming is equated to a set of language mechanisms for implementing type-safe polymorphic containers, such as List<T> in Java. The notion of generic programming that motivated the design of the Standard Template Library (STL) advocates a broader definition: a programming paradigm for designing and developing reusable and efficient collections of algorithms. The functional programming community uses the term as a synonym for polytypic and type-indexed programming, which involves designing functions that operate on data-types having certain algebraic structures.

In "Generic Programming" written in 1988 by David A. Musser and Alexander A. Stepanov use Ada and Scheme and describe generic programming as:

By generic programming, the definition of algorithms and data structures at an abstract or generic level, thereby accomplishing many related programming tasks simultaneously. The central notion is that of generic algorithms, which are parameterized procedural schemata that are completely independent of the underlying data representation and are derived from concrete efficient algorithms.
...
A discipline that consists of the gradual lifting of concrete algorithms abstracting over details, while retaining the algorithm semantics and efficiency.

That last sentence has a very Agile/Refactoring feel to it.

Ralf Hinze in "Generic Programs and Proofs" describes it as:

A generic program is one that the programmer writes once, but which works over many different data types.
...
Broadly speaking, generic programming aims at relieving the programmer from repeatedly writing functions of similar functionality for different user-defined data types. A generic function such as a pretty printer or a parser is written once and for all times; its specialization to different instances of data types happens without further effort from the user. This way generic programming greatly simplifies the construction and maintenance of software systems as it automatically adapts functions to changes in the representation of data.
...
So, at first sight, generic programming appears to add an extra level of complication and abstraction to programming. However, I claim that generic programming is in many cases actually simpler than conventional programming. The fundamental reason is that genericity gives you "a lot of things for free" ...

This last statement resonates with me as it parallels my thoughts on the desirability of using software frameworks and I feel that I continuously have this conversation (argument) with junior level programmers and also senior level cargo cult coders who don’t get it.

Design Patterns are a common generic concept in mainstream programming. It turns out that a number of design patterns can be described in the framework laid out by these papers and a number of papers do exactly that including "Design Patterns as Higher-Order Datatype-Generic Programs " part of extensive research on the topic by Jeremy Gibbons:

Design patterns, as the subtitle of the seminal book has it, are ‘elements of reusable object-oriented software’. However, within the confines of existing mainstream programming languages, these supposedly reusable elements can only be expressed extra-linguistically: as prose, pictures, and prototypes. We believe that this is not inherent in the patterns themselves, but evidence of a lack of expressivity in the languages of today. We expect that, in the languages of the future, the code parts of design patterns will be expressible as directly-reusable library components. The benefits will be considerable: patterns may then be reasoned about, typechecked, applied and reused, just as any other abstractions can. Indeed, we claim that the languages of tomorrow will suffice; the future is not far away. All that is needed, in addition to what is provided by essentially every programming language, are higher-order (parametrization by code) and datatype-generic (parametrization by type constructor) features. Higher-order constructs have been available for decades in functional programming languages such as ML and Haskell. Datatype genericity can be simulated in existing programming languages [2], but we already have significant experience with robust prototypes of languages that support it natively [2]. We argue our case by capturing as higher-order datatypegeneric programs a small subset ORIGAMI of the Gang of Four (GOF) patterns. (For the sake of rhetorical style, we equate ‘GOF patterns’ with ‘design patterns’.) These programs are parametrized along three dimensions: by the shape of the computation, which is determined by the shape of the underlying data, and represented by a type constructor (an operation on types); by the element type (a type); and by the body of the computation, which is a higher-order argument (a value, typically a function). Although our presentation is in a functional programming style, we do not intend to argue that functional programming is the paradigm of the future (whatever we might feel personally!). Rather, we believe that functional programming languages are a suitable test-bed for experimental language features - as evidenced by parametric polymorphism and list comprehensions, for example, which are both now finding their way into mainstream programming languages such as Java and C#. We expect that the evolution of programming languages will continue to follow the same trend: experimental language features will be developed and explored in small, nimble laboratory languages, and the successful experiments will eventually make their way into the outside world. Specifically, we expect that the mainstream languages of tomorrow will be broadly similar to the languages of today — strongly and statically typed, object-oriented, with an underlying imperative mindset — but incorporating additional features from the functional world — specifically, higher-order operators and datatype genericity

Lately I have been making more of an effort to learn Scala. Bruno Oliveira and Jeremy Gibbons make the case in "Scala for Generic Programmers" that it might be one of or even the best language for expressing Generic Programming concepts in part due to its hybridization of object oriented and functional paradigms. Here is the abstract in its entirety:

Datatype-generic programming involves parametrization by the shape of data, in the form of type constructors such as ‘list of’. Most approaches to datatype-generic programming are developed in the lazy functional programming language Haskell. We argue that the functional object-oriented language Scala is in many ways a better setting. Not only does Scala provide equivalents of all the necessary functional programming features (such parametric polymorphism, higher-order functions, higher-kinded type operations, and type- and constructor-classes), but it also provides the most useful features of object-oriented languages (such as subtyping, overriding, traditional single inheritance, and multiple inheritance in the form of traits). We show how this combination of features benefits datatype-generic programming, using three different approaches as illustrations.

So far for me Scala has been a joy. Its features and flexibility are beautifully elegant of course I am tainted coming from the clumsy, clinking, clanking, clattering caliginous world of Java, BTW Java we need to talk, it’s not you, it’s me. I’ve met someone else.

Jeremy Gibbons Makes the following point in "Datatype-Generic Programming":

Generic programming is about making programming languages more flexible without compromising safety. Both sides of this equation are important, and becoming more so as we seek to do more and more with computer systems, while becoming ever more dependent on their reliability.

This idea is similar to ideas that Gerald Sussman explores in his talk at Strangeloop 2011 "We Really Don’t Know How To Compute". However, his general and perhaps philosophical approach to the idea of generic programming is very different from the previous ideas:

[8:40]We spend all of our time modifying existing code, the problem is our code is not adequately evolvable it is not adequately modifiable to fit the future in fact what we want is to make systems that have the property that they are good for things that the designer didn’t even think of or intend. ... That’s the real problem. ... The problem is that when we build systems whatever they are we program ourselves into holes. ... It means that we make decisions early in some process that spread all over our system the consequences of those decisions such that the things that we want to change later are very difficult to change because we have to change a very large amount of stuff. ... I want to figure ways that we can organize systems so that the consequence of the decisions we make are not expensive to change, we can’t avoid making decisions but we can try to figure out ways to minimize the cost of changing decisions we have made.
...
[11:40]Dynamically extensible generic operations ... I want to dynamically extend things while my program is running ... I want a program to compute up what it is going to do.

He shows an example[19:00] which is "the definition of the method of computing Lagrange Equations from a Lagrangian given a configuration space path". This is apparently easy stuff, to him at least. Fortunately you don’t really need to understand the details of this math to understand his points here:

[20:45]This is a little tiny feeling of what is it needed to make things more flexible consider the value here, the value is that I am dynamically allowed to change the meaning of all my operators to add new things that they can do. Dynamically. Not at compile time, at runtime.
...
[21:15]It gives me tremendous flexibility. Flexibility because the program I wrote to run on a bunch of numbers with only a little bit of tweaking suddenly runs on matrices so long as I didn’t make mistake of commuting something wrong. ... So this makes proving the theorems very hard In fact maybe impossible the cost and the benefits are very extreme. I can pay correctness or proofs of correctness or belief in correctness, I can pay that for tremendous flexibility. ... Is correctness the essential thing I want, I am not opposed to proofs, I love proofs I am an old mathematician. The problem is that putting us into the situation that Mr. Dijkstra got us into ... where you are supposed to prove everything to be right before you write the program getting into that situation puts us in a world where we have to make the specifications for the parts as tight as possible because it’s hard to prove general things, except occasionally it’s easier but it’s usually harder to prove a general theorem about something so you make a very specialized thing that works for a particular case you build this very big tower of stuff and boy is that brittle change the problem a little and it all falls over.

His ideas tend towards adaptable and self modifying code at one point he talks about self modifying code in both assembly language and the use of eval in Lisp. He also touches on a number of parallels in the biological world including genetics and the adaptability of the human immune system to fight off mutating parasites. Interestingly some of his ideas seem similar to work done by Stephanie Forrest on using evolutionary/genetic algorithms for program repair which she speaks about at the keynote of SPLASH 2010, formerly known as OOPSLA. The genetic repair approach at the conceptual level is striving for the same results as what Gerald Sussman describes. Also did you notice that the Levenshtein Distance between "generic programming" and "genetic programming" is (one substitution), pretty weird!

Seriously though, the objectives of all of these approaches are the same: building better more flexible software that handles more situations. These ideas are reflected in how Glenn Vanderburg describes software engineering:

Software engineering is the science and art of designing and making, with economy and elegance, systems so that they can readily adapt to situations which they may be subjected.

I feel this Software Engineering insight reinforces my opinion that we will eventually see a real math backed discipline of software engineering and I think that the research work discussed here both supports my conjecture on this point and most likely lays part of that foundation. I also think some of these ideas reflect ideas I discussed in "The Software Reuse Complexity Curve" which implies that programmers will need to become more aware and more adept at more advanced theoretically based concepts like for example GADT’s (Generalized algebraic datatypes), etc. and of course math in general such as Abstract Algebra.

Clearly the idea of Generic programming has some very specific ideas and some very general objectives. There seem to be two approaches here which might described Predictive vs. Adaptive, much like the idea of Separation of Design and Construction described by Martin Fowler. The Gibbons et al more calculational approach is predictive whereas the Sussman/Forrest ideas describe an adaptive approach. I suspect that this will just be another dimension of program design that will have to be considered and assessed as a tradeoff in its application, which Sussman also mentions.

Elegant Coding

10 April 2017

The Problem with Today’s Software Thought Leaders

References and Further Reading

26 September 2016

Software Development is a Risky Business

References and Further Reading

23 March 2015

What is Software Correctness?

REQUIREMENTS

STRUCTURAL

BOTH

References and Further Reading

15 March 2014

Are Your Developers Treating Your Project as Their Own Personal Science Project?

07 August 2012

Why You Can’t Have a Real Software Engineering Discipline

Low Entry Requirements

The Lack of Differentiation between Science and Engineering

Bad Management

The Disconnect between Academia and Industry

The Lack of an Effective CS Math Curriculum

The Software Architect Debacle

The Continuous Turnover of the Work Force

The Social Networking Drain and Hollywoodification of Silicon Valley

17 July 2012

A Framework for Software System Naming

On the Importance of Naming in Software Systems part IV

Structural

Semantic

13 May 2012

Software System Morphology

On the Importance of Naming in Software Systems, part III

29 April 2012

What is Generic Programming?

About Me