Moved

Moved. See https://slott56.github.io. All new content goes to the new site. This is a legacy site, and will likely be dropped five years after the last post in Jan 2023.

Showing posts with label java. Show all posts

Tuesday, July 12, 2016

Getting Rid of the Gang-of-Four Design Patterns is Nonsense

Someone found Yet Another Post (YAP™) insisting that the Gang of Four (GOF™) patterns were on their last legs. The email was misleading, because this is not precisely what the article said. The bottom line was that Design Patterns in general are merely a response to gaps in the underlying programming language. A position that's nonsense at its very foundation.

The lexicon of design patterns varies from language to language. GoF patterns aren't "going away." They're part of the Java/C++ world. They don't apply quite the same way to Python or functional languages.

There's a more serious issue, though: Language Mapping. First some background.

Design Patterns

Design Patterns will always exist. They're an artifact of how we process the world. We tend to classify individual objects so that we don't have to deal with each object as a separate wonder of nature.

It's Just Another Brick In The Wall.

We don't have to examine each rectangular solid of ceramic and understand the wonderfulness of it. We can group and summarize. Classify. Brick is a design pattern. So is masonry. So is wall. They're all patterns. It's how we think.

Design Patterns and Language Gaps

There's a claim that moving toward functional languages will kill design patterns. This presumes (partly) that non-OO languages magically don't have design patterns. This is (see above) kind of insane. Languages have design patterns. We recognize these patterns all the time.

A functional language has a common technique (or pattern) for visiting nodes in a hierarchy. We don't dwell on the wonderfulness of the code as if we'd never seen it before. Instead, we classify it based on the design pattern, and leverage this higher-level understanding to figure out why we're walking a hierarchy.
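As a sketch of that pattern in Python terms (the nested-dict node shape here is an invented example, not any particular library's structure):

```python
# Hedged sketch: a functional-style hierarchy visitor.
# The node shape (dicts with a "children" list) is an assumption
# made for illustration only.

def visit(node, action):
    """Apply action to a node, then recursively to its children."""
    action(node)
    for child in node.get("children", []):
        visit(child, action)

tree = {
    "name": "root",
    "children": [
        {"name": "a", "children": [{"name": "a1", "children": []}]},
        {"name": "b", "children": []},
    ],
}

names = []
visit(tree, lambda n: names.append(n["name"]))
print(names)  # ['root', 'a', 'a1', 'b']
```

We recognize the shape instantly -- preorder traversal -- and move on to asking *why* the hierarchy is being walked.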

Sounding the death knell for design patterns also presumes (partly) that functional languages are magically more complete than OO languages. In this newer better language, we don't need patterns because there are no gaps. This is pretty much nutso, too. The Patterns Fill Language Gaps school of thought ignores the fact that there are many ways to implement these "gaps". We can use GoF design patterns, or we can use other software designs that don't fit the GoF design patterns. Both work.

The patterns aren't filling a "gap." They're providing guidance on how to implement something. That's all. Nothing more. Guidance.

"But wait," you say, "since I needed to write code, that's evidence that there's a gap."

"What?" I ask, incredulous. "Are you claiming that any code is evidence of a language gap? Does that mean all application software is just a language gap?"

"Let's not be silly," you say. "I can split a hair and create a tiny distinction between software I shouldn't have to write and software I should have to write."

I remain incredulous.

Design Patterns as Damage

The idea that somehow the GoF design patterns are a problem is also goofy. The GoF design patterns are pretty slick. They solve a fairly broad suite of problems in an elegant and consistent manner.

They're just good design.

Yes, they can be complex. Sorry about that. Software can be complex if you want really excellent flexibility and extensibility.

AND.

Bonus.

Software can be complex when you have to work around the problems of "compiler" and "locked libraries" and "no source." That is, the GoF patterns apply in full force for C++ and Java where you're trying to protect your intellectual property by disclosing only headers and obfuscated implementation details. Indeed, there are few alternatives to the GoF patterns if you're going to distribute a framework that has no visible source and needs to leave extension points for users.

If you don't have Locked-NoSource-Compiled code as a backdrop, the GoF patterns can be simplified a little. But some of the patterns are essential. And remain essential. There are some really great ideas there.

In Python world, we rely on a modified subset of the GoF patterns. They work extremely well.

When writing functional-style Python using immutable data structures (to the extent possible), we use a different set of design patterns. Not so many GoF patterns when we're trying to avoid stateful objects. But some patterns (like the Abstract Factory) are really very helpful even in a largely functional context. It morphs from an abstract factory class to a factory function, and it loses the "abstract" concept that's part of C++ and Java, but the core Factory design pattern remains.
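A sketch of that morphing (the reader classes here are invented names for illustration): the abstract superclass disappears, and a plain function does the dispatching.

```python
# Hedged sketch: a GoF-style Abstract Factory reduced to a factory
# function. CSVReader and JSONReader are made-up example classes.

import json

class CSVReader:
    def read(self, text):
        return [row.split(",") for row in text.splitlines()]

class JSONReader:
    def read(self, text):
        return json.loads(text)

def make_reader(format_name):
    """Factory function: no abstract base class required."""
    readers = {"csv": CSVReader, "json": JSONReader}
    return readers[format_name]()

reader = make_reader("csv")
print(reader.read("a,b\nc,d"))  # [['a', 'b'], ['c', 'd']]
```

The core Factory idea -- one place that decides which concrete class to build -- survives intact; only the C++/Java "abstract" scaffolding is gone.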

The Serious Issue

The serious issue that is surfaced by the email is Language Mapping. We cannot (and must not) try to map languages to each other. What is true for Java design is emphatically not true for Python design. And it doesn't apply to assembly languages, FORTRAN, FORTH, or COBOL.

Languages are different.

There. I said it.

If there was an underlying "universal deep structure" behind all programming languages, the surface features would be merely syntax, and we'd have automated translation among languages. The universal deep structure (the underlying Turing Machine that does computations) appears to be too abstract to map well among programming languages. Hence the lack of translators.

When switching among languages, it's important to leave all baggage behind.

When moving from Java < 8 to Java >= 8 (i.e., non-functional Java to more functional Java) we can't trivially map all design patterns among the language features. It's a new language with new features that happens to be compatible with the old language.

Attempting to trivially map concepts between non-functional (or strictly-OO Java) and more functional Java leads to dumb conclusions. Like the GoF patterns are dying. Or the GoF patterns represent damage or something else equally goofy.

The language changes lead to design pattern changes.

Language change doesn't deserve a gleeful/anguished blog post celebrating/lamenting the differences. It's a consequence of learning a new language, or new features of an existing language.

Please avoid mapping languages to each other.

Tuesday, January 12, 2016

"Learn Python" is growing

https://dzone.com/articles/learn-python-overtakes-learn-java

I've seen companies making sincere enterprise-wide commitments to doing all data analysis in Python. There's no reason for quants and analysts to struggle with Java.

I field one or two questions a week from folks pushing the in-house envelope on data acquisition.

I also field questions on language basics.

It's a very exciting thing.

I'm glad I started down this road 15 years ago.

Tuesday, April 14, 2015

Java vs Python


Seems silly at first.

http://blogs.perceptionsystem.com/infographic/java-vs-python-programming-language-productive

It's mostly true. It's also incomplete.

For example, "no tuples in Java" ignores http://commons.apache.org/proper/commons-lang/apidocs/org/apache/commons/lang3/tuple/package-summary.html.

It seems like the only interesting takeaway from the infographic is that Java syntax is wordy.

One hesitates at a detail-by-detail comparison. Does it help to match the 20 or so statements in Python against equivalents in Java? Does it help to match the byzantine complexity of Java public/protected/private against something that doesn't exist in Python? Does it even make sense to try and compare the complexity of Java annotations with anything? What about Python meta-programming? The fact that we can easily overload Python operators?
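To make the operator-overloading point concrete, here's a minimal sketch (the Money class is invented for illustration, not from any library):

```python
# Hedged sketch: operator overloading via Python "dunder" methods.
# Money is a made-up example class.

class Money:
    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)

    def __eq__(self, other):
        return self.cents == other.cents

    def __repr__(self):
        return f"Money({self.cents})"

print(Money(250) + Money(125))  # Money(375)
```

Two special methods and the `+` and `==` operators just work. The nearest Java equivalent is a named `add` method; there's no point-by-point comparison to make.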

Perhaps it's only because I'm an expert in both languages that I hesitate to try point-by-point comparison.

Tuesday, March 10, 2015

It appears that DevOps may be more symptom than solution

It appears that DevOps may be a symptom of a bigger problem. The bigger problem? Java.

Java development -- with a giant framework like Spring -- seems to accrete layers and layers of stuff. And more stuff. And bonus stuff on top of the excess stuff.

The in-house framework that's used on top of the Spring Framework that's used to isolate us from WebLogic that's used to isolate us from REST seems -- well -- heavy. Very heavy.

And the code cannot be built without a heavy investment in learning Maven. It can't be practically deployed without Hudson, Nexus, and Subversion. And Sonar and Hamcrest and mountains of stuff that's implicitly brought in by the mountain of stuff already listed in the Maven pom.xml files.

The deployment from Nexus artifacts to a WebLogic server also involves uDeploy. Because the whole thing is AWS-based, this kind of overhead seems unavoidable.

Bunches of web-based tools to manage the bunch of server-based tools to build and deploy.

Let me emphasize this: bunches of tools.

Architecturally, we're focused on building "micro services". Consequently, an API takes about a sprint to build. Sometimes a single developer can do the whole CRUD spectrum in a sprint for something relatively simple. That's five API's by our count: GET one, GET many, POST, PUT and DELETE: each operation counts as a separate API.

Then we're into CI/CD overheads. It's a full sprint of flailing around with deployment onto dev servers to get something testable and get back test results so we can fix problems. A great deal of time spent making sure that all the right versions of the right artifacts are properly linked. Doesn't work? Oh. Stuff was updated: fix your pom's.

It's another sprint after that flailing around with the CI/CD folks to get onto official QA servers. Not building in Hudson? Wrong Nexus setup in Maven: go back to square one. Not deployable to WebLogic? Spring Beans that aren't relevant when doing unit testing are not being built by WebLogic on the QA server because the .XML configuration or the annotations or something is wrong.

What's important here is that ⅔ of the duration is related to the simple complexity of Java.

The DevOps folks are trying really hard to mitigate that complexity. And to an extent, they're sort of successful.

But. Let's take a step back.

  1. We have hellish complexity in our gigantic, layered software toolset.
  2. We've thrown complicated tools at the hellish complexity, creating -- well -- more complexity.

This doesn't seem right. More complexity to solve the problems of complexity just doesn't seem right.

My summary is this: the fact that DevOps even exists seems like an indictment of the awful complexity of the toolset. It feels like DevOps is a symptom and Java is the problem.

Thursday, December 4, 2014

Architectural Principles, Spring Framework, and Jersey JAX-RS

See this: http://www.moschetti.org

Attended a meeting with Buzz. Not stated in his blog (in an obvious way) was something he said about not being a fan of big frameworks. I didn't write down his punchline, but it was a pretty pithy summary of the framework tradeoff.

IIRC, it was essentially this: you can wrestle with one or both of these technical problems.
  • Boilerplate Code
  • A Framework's Conceptual Model
Either you have to create your own libraries or you have to learn someone else's. This is in addition to wrestling with the business problem you're supposed to be solving.

Buzz's point seemed to be that you can often manage your own boilerplate more easily than you can come to grips with a framework. If one member of your sprint team handles reusable services, you can just ask them for a feature. You don't have to spend an hour reading other people's struggles.

After spending three months getting my brain wrapped around Spring Framework, I'm inclined toward partial, qualified agreement. Frameworks seem to have limited value until you're an expert in using them.

Layers and Layers

When wrestling with a new feature, you are forced to assume that you've understood its semantics. When you mock a framework element for test purposes, you're reduced to hoping that your unit tests are sufficient. A unit test of a mocked framework element only tests your assumptions. If you're not using the element's API correctly, your tests can't show that the framework will break or raise exceptions.

For new technology, you need to start with a technical spike to understand the framework. Then you can write unit tests that test against known framework behavior. Then you can write the real code that's based on the unit tests that are based on a spike that shows how the framework really works.

Using a technical spike for discovery and debugging can be challenging. You don't want to drag around your entire application just to create a spike. But you don't want to drop back to a trivial "hello world" spike that doesn't really apply to your context. You have to balance simplicity against realism.

For example, making JAX-RS requests to web services is aggravating to debug. You can spend many hours looking at boilerplate 401 and 404 errors wondering what's missing. You can't write the unit tests until you finally get something to work. Once you have something, you can replace real objects with mock objects.

If you already know JAX-RS features, it's easy. If you already know the RESTful service, it's not too bad. If you know neither JAX-RS nor the service, you don't have any clue which direction to turn. Did I misuse JAX-RS? Is something wrong in the request? Am I missing a required header? Did I leave something off the Accept header?

I finally had to give up creating spikes and debugging RESTful requests in Java. It turned out to be simpler to write a version of the REST client in Python. I used this to figure out how the real service really worked. Given a working Python spike, I could then save those interactions for WireMock.
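A spike like that can stay entirely in the standard library. This sketch is one plausible shape for it -- the host, path, and token are placeholders, not the real service's values -- with logging turned all the way up so 401/404 surprises become visible:

```python
# Hedged sketch of a discovery spike using only the standard library.
# Host, path, and token are placeholders; substitute real values.

import http.client
import json
import logging

logging.basicConfig(level=logging.DEBUG)

def build_headers(token):
    """Headers for the hypothetical service; adjust as needed."""
    return {
        "Accept": "application/json",
        "Authorization": "Bearer " + token,
    }

def get_resource(host, path, token):
    """One GET with full logging, so auth and routing mistakes are visible."""
    conn = http.client.HTTPSConnection(host)
    conn.request("GET", path, headers=build_headers(token))
    response = conn.getresponse()
    logging.debug("status=%s headers=%s", response.status, response.getheaders())
    body = response.read()
    conn.close()
    return response.status, json.loads(body) if body else None

# Example call (placeholder host/path):
# get_resource("service.example.com", "/api/v1/things", "secret-token")
```

At the `>>>` prompt, each tweak to a header or path gets tried in seconds, and the logged responses become the raw material for WireMock stubs.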

Once I had a clue how the service worked, I could also write a mock server for some more sophisticated experiments. This was useful for debugging problems based on a failure to understand JAX-RS.

Yes. Rather than struggle with the framework, I wrote the client once in Python and then rewrote the client again in Java. It seemed quicker than trying to debug it in Java.

One contributing factor is the 1m 30s build time in Maven. Compare that with interactive Python at the >>> prompt.

Perhaps a smaller framework would have been better.

Thursday, October 16, 2014

Using Bottle as a miniature demo server

Let's talk small.

When writing API's, it sometimes helps to have a small demo web site to show the API in a context that's easy to visualize. API's are sometimes abstract, and without an application to provide some context, it can be unclear why the path looks like that or why the JSON document has those fields.

I want to emphasize the "small" part of the small demo. A small page or two with some HTML forms and a submit button. Really small.

The actual customer-facing apps (mobile, mobile web, and full web site) are being built by lots of other people. Not us. They're big. We build the API's (there are a lot) that support the data structures that support the processing that supports the user experience.

Building fake mobile apps is right out. We're not going to lard on Android SDK or Xcode development environments to our already overburdened laptops. We build backend API's.

Building a fake mobile web or full web site is appealing. What makes it complex is the UX folks are building everything in Angular.js. If we want to properly implement a form, we would have to master Angular just to do a demo for the product owner.

No thanks. Still too far afield for API developers. We're focused on mongo and JSON and performance and scalability. Not Angular.js and the UX.

What we want to do is build a small web server which runs just a few pages plucked out of the UX demo code so that we can show how interactions with a web page put stuff in a database. And vice-versa: stuff in the database will show up on a web page.

"Really?" we get asked. Some folks look askance at us for wanting to put a small demo site together.

"Yes," we answer. "Our product owner has a big vision and we're breaking that into a bunch of little API's. It's not perfectly clear how we're building up to that vision."

It's not perfectly clear how some of this works. Folks outside the scrum team have distracting questions. We want to have a page or two where we can fill in a form and click submit and stuff happens. This is far easier to explain than showing them Postman or SoapUI and claiming that this will support some user stories.

And as we grow toward the epic, the workflow aspects of this will grow. The stuff that admin "A" does after user "U" has made an initial request. Or the stuff that internal user "I" does after external user "X" has done something. But really, it's just a few small web pages. Small.

Imagine the demo. On laptop #1, we'll show user "X". On laptop #2, we're running a Mongo shell to query what's in the db. On laptop #3 we're showing user "I". The focus is really the API's. And how the API's add up to an epic collection of stories.

Serving some HTML pages

Just to make it painful, we can't simply grab the demo web pages out of the UX team's SVN repository. Why not? First, it's an Angular app. We can't just grab some HTML and go. The demo pages are served via node.js with Bower, so it's not even clear (to us) what the complete deployment looks like.

So. We cheated. We took a screen shot. We trimmed the edges and saved the pages as .PNG files. We wrote our own form and cobbled in enough CSS to make it look close. We're not here to fake out the UX. We just want to enter some data and have it tickle our API. (Indeed, we have a "Not The Real Experience" disclaimer on some pages.)

Initially, some of the team members tried serving these small pages with WebLogic. Then Jetty. It's not bad. But it's Java. It takes forever to build and deploy something after a trivial change. There are a lot of moving parts even with Jetty, and not all of them are obvious.

Since we're building "enterprise" API's, we're deeply enmeshed with every feature of the Spring Framework. Our STS/Eclipse environments are fat with add-ons and features.

While the Spring Framework ideal is to allow a developer to focus on relevant details and have the irrelevant details handled automagically, the magic almost gets in the way. These are small applications that are little more than a few static pages with forms and a submit button. Spring can do it, of course. But we're often testing out the actual API's in a Jetty server (or two). If the demo site requires yet another instance of Jetty with yet another configuration, our ability to cope diminishes.

How can we get back to small?

Python and Bottle

Python has several web servers built-in. We can use http.server. We can use wsgiref. Both of these are almost OK for what we want to do.

We can do better with two small downloads: Bottle and Jinja2. With these, we can build simple HTML pages that show some data. We can build simple servers that collect form data, use http.client to make API requests, and write copious logging details. We can write little bottle apps that handle just GETs and POSTs simply.

This is suitably small.

We can share the module with the Bottle object and the HTML mock-up pages. We can fire up the app in an instant on anyone's laptop, no matter what else they're running. We can tweak the server to adjust the logging or the API request or the form.

We actually run the server from within Idle. Make a change and hit F5 to redeploy after a change. It's small. It's fast. And it doesn't involve the huge complexities associated with Java.

Bottle doesn't do much. But what little it does do is a pretty tidy fit with tiny little demonstrations of super-simple HTML interactions.

Tuesday, October 29, 2013

When to choose Python over Java and vice versa ??: A Very Silly Question

The correct answer is: It Doesn't Matter.

In spite of this.

(A) The question gets asked.

And worse.

(B) It gets answered. And people take their answers seriously. As if there are Profound Differences among programming languages.

Among Turing Complete programming languages there are few Profound Differences.

The pragmatic differences are the relative ease (or pain) of expressing specific algorithms or data structures.

This means that there's no easy, blanket, one-size-fits-all answer to such a silly question.

You can have some code (or data) which is painful in Java and less painful in Python.

But.

You can also find an extension library that makes it much, much less painful.

This, alone, makes the question largely moot.

When it comes to a specific project, the question of the team's skills, the existing infrastructure, and any integration requirements are the driving considerations.

Because of this, an incumbent language has huge advantages.

If you've already got a dozen web sites in Java, there's no good reason to flip-flop between Java and Python.

If you're going to switch from some Java Framework to Django, however, you'd do this as part of a strategic commitment to drop Java and convert to Python.

To read the discussion, see LinkedIn Python Community. http://www.linkedin.com/groupItem?view=&gid=25827&type=member&item=5796513808700682243&qid=8612bee7-76e1-4a35-9c80-c163494191b0&trk=groups_most_popular-0-b-ttl&goback=%2Egmp_25827

Thursday, October 6, 2011

Command Line Applications

I'm old -- I admit it -- and I feel that command-line applications are still very, very important. Linux, for example, is packed full of almost innumerable command-line applications. In some cases, the Linux GUI tools are specifically just wrappers around the underlying command-line applications.

For many types of high-volume data processing, command-line applications are essential.

I've seen command-line applications done very badly.

Overusing Main

When writing OO programs, it's absolutely essential that the OS interface (public static void main in Java or the if __name__ == "__main__": block in Python) does as little as possible.

A good command-line program has the underlying tasks or actions defined in some easy-to-work with class hierarchy built on the Command design pattern. The actual main program part does just a few things: gather the relevant environment variables, parse command-line options and arguments, identify the configuration files, and initiate the appropriate commands. Nothing application-specific.

When the main method does application-specific work, that application functionality is buried in a method that's particularly hard to reuse. It's important to keep the application functionality away from the OS interface.

I'm finding that main programs should look something like this:


import logging
import sys

if __name__ == "__main__":
    logging.basicConfig(stream=sys.stderr)
    args = parse_args()
    logging.getLogger().setLevel(args.verbosity)
    try:
        for file in args.file:
            with open(file, "r") as source:
                process_file(source, args)
        status = 0
    except Exception as e:
        logging.exception(e)
        status = 3
    logging.shutdown()
    sys.exit(status)

That's it. Nothing more in the top-level main program. The process_file function becomes a reusable "command" and something that can be tested independently.
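A process_file written in that style stays testable without touching main at all. The word-count behavior below is an invented example of application logic, just to show the shape:

```python
# Hedged sketch: a reusable "command" function that the main program
# calls. The word-count behavior is an invented example.

import io

def process_file(source, args=None):
    """Count words in an open file-like object; no CLI dependencies."""
    return sum(len(line.split()) for line in source)

# Testable without argparse, logging, or sys.exit:
print(process_file(io.StringIO("one two\nthree\n")))  # 3
```

Because it accepts any file-like object, a unit test feeds it a StringIO; the OS interface never gets involved.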

Thursday, January 6, 2011

Java PHP Python -- Which is "Faster In General"?

Sigh. What a difficult question. There are numerous incarnations on StackOverflow. All nearly unanswerable. The worst part is questions where they add the "in general" qualifier. Which is "faster in general" is essentially impossible to answer. And yet, the question persists.

There are three rules for figuring out which is faster.

And there are three significant problems that make these rules inescapable.

Rule One. Languages don't have speeds. Implementations have speeds.

Info on benchmarking. The idea of a benchmark is to have a single, standard suite of source code, which can be used to compare compilers, run-time libraries or hardware.

Having a standard suite of source is essential because it provides a basis for comparison. A single benchmark source is the fixed reference. We don't compare the top of the Empire State Building with the top of the Stratosphere in Las Vegas without specifying whether we care about height above the ground or height above sea level. There has to be some fixed point of comparison, some common attribute, or the measurements devolve into mere numerosity.

Once we have a basis for comparison (one single body of source code), the other attributes are degrees of freedom; the measurements we make will include the other attributes. This will allow a rational statement of what the experimental results were. We can then compare these various free attributes against each other. For details look at something like the Java Micro Benchmark.

Rule Two. Statistics Aren't a Panacea.

The reason there's no "in general" comparison among languages is because there are too many degrees of freedom to make any kind of rational comparison. We can make irrational comparisons, but that's the trap of numerosity -- throwing numbers around. 1250 vs. 1149, 1300 vs. 3177. What do they mean? Height above ground? Height above sea level? What's being measured?

There's a huge problem with claiming that statistics will yield an answer to which language implementation is faster "in general". We need some population that we can sample and measure. Problem 1: What population are we measuring? It can't be "programs": we can't compare grep against Apache httpd. Those two programs have almost no common features.

What makes the population of programs difficult to define is the language differences. If we're trying to compare PHP, Python and Java, we need to find a program which somehow -- magically -- is common across all three languages.

The Basis For Comparison

Finding common programs degenerates into Problem 2: what programs could be comparable? For example, we have the Tomcat application, written in Java. We wouldn't want to write Tomcat in Python (since Tomcat is a Java Servlet container). We could probably write something Tomcat-like in PHP, but why try? So we can't just grab programs randomly.

At this point, we devolve to subjectivity. We need to find some kind of problem domain in which these languages overlap. This gets icky. Clearly, big servers aren't a good problem domain. Almost as clearly, command-line applications aren't the best idea. PHP does run from the command-line, but it's always contrived-looking because it doesn't exploit PHP's strengths. So we wind up looking at web applications because that's where PHP excels.

Web applications? Subjective? Correct. PHP is a language plus a web application framework bundled together. Java and Python -- by themselves -- are just languages and require a framework. Which Java (and Python) framework is identical to PHP's framework? Spring, Struts, Django, Pylons? None of these reflects a code base that's even remotely similar. Maybe Java JSP is similar enough to PHP. For Python there are several implementations. Sigh.

Crappy Program Problem

We can't easily compare programs because we're really comparing implementations of an algorithm. This leads to Problem 3: we picked a poor algorithm or did a lousy job of implementing it in the target language.

In order to be "comparable", we don't want to exploit highly-optimized or unique features of a language. So we tried to be generic. This is fraught with risks.

For example, Java and PHP don't have list comprehensions. Do we forbid them from our Python benchmark? In Python, everything is a reference; values aren't copied implicitly. If we pick an algorithm implementation which depends on copying objects, Java may appear to excel. If we pick an algorithm implementation which depends on sharing references, Python may appear to excel.
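The reference-semantics point fits in two lines; this snippet is just an illustration of the language behavior, not a benchmark:

```python
# Hedged illustration: Python names share references; no implicit copy.

a = [1, 2, 3]
b = a            # b is another name for the same list object
b.append(4)
print(a)         # [1, 2, 3, 4] -- no copy ever happened

# And the kind of comprehension a "generic" benchmark might forbid:
squares = [n * n for n in range(5)]
print(squares)   # [0, 1, 4, 9, 16]
```

An algorithm "translated" into Python while pretending these features don't exist isn't really Python; it's Java with Python syntax.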

Somehow we have to get past language differences and programmer mistakes. What to do?

Synthetic Benchmarks

Since we can't easily find comparable programs -- as whole programs -- we're left with the need to create some kind of benchmark based on language primitives. Statements or expressions or something. We can try to follow the Whetstone/Dhrystone approach of analyzing a bunch of programs to find the primitive constructs and their relative frequency.

Here's the plan. We'll take 100 PHP programs, 100 Java programs and 100 Python programs and analyze them to find the relative frequency of different kinds of statements. What then?

The goal is to create one source that reflects the statements actually used in the 300 programs we analyzed. In three different languages. Hmmm... Okay. We'll need to create a magical mapping among the statement constructs in the three languages. Well, that's hard. The languages aren't terribly comparable. A Python expression using a List Comprehension is the same thing as a multi-statement Java loop. Rats.

The languages aren't very comparable at the statement level at all. And if we force them to be comparable, we're not comparing real programs, but an artificial mapping.

Virtual Machine Benchmarks

Since we can't compare the languages at the program level or the statement level, what's left? Clearly, the underlying interpreter is what we care about.

We're really comparing the Java Virtual Machine, the PHP interpreter and the Python interpreter. That's what we really care about.

And life is simple because we can compare Java, The Project Zero PHP Interpreter based on the JVM and Jython. We can look at "compiled" PHP, Java Class Files and Python .PYC files to find the VM primitives used by each language and then -- what? Compare the run-time of the various VM primitives? No, that's silly, since the run-times are all JVM run-times.

What We're Left With

The very best we can do is to compare the statistical distribution of the VM instructions created by Java, PHP or Jython compilers. We could note that maybe PHP or Python uses too many "slow" VM instructions, where Java used more "fast" VM instructions. That would be an "in general" comparison. Right?

See? You can measure anything.

In this case, the compiler itself is a degree of freedom. Sadly, we're not comparing languages "in general". We're comparing the bytecodes created by various compilers. We're actually comparing compilers and compiler optimizations of the bytecode. Sigh.

That's not what we were hoping for. We were hoping for some kind of "in general" comparison of the language, not the JVM compiler.

Java has pretty sophisticated optimization. Python, however, eschews optimization. PHP has its own weird issues. See this paper from Rob Nicholson from the CodeZero project on how to implement PHP in the JVM. PHP doesn't fit the JVM as well as Python does. So there's a weird bias.

Rule Three. Benchmarking Is Hard.

There is no "in general" comparison of programming languages. All that we can do is benchmark something specific.

It works like this.
  1. Stop quibbling about language performance "in general".
  2. Find something specific and concrete we plan to implement.
  3. Actually write the performance-critical piece in Java, PHP, Python, Ruby, whatever. Yes. Build it several times. Really. We don't want to use "language-independent" or "common" features. We want to optimize ruthlessly -- use the language the way it was meant to be used, use the various unique-to-the-language features correctly and completely.
  4. Actually run the performance-critical piece to get actual timings.
  5. Since run-time libraries and hardware are degrees of freedom, we have to use multiple run-time libraries, multiple compiler optimization settings and multiple hardware configurations to make a proper decision on which language to use for our specific problem.
Now we know something about our specific problem domain and the available languages. That's the best we can do.
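For the Python entry, step 4 of that recipe is mostly the standard timeit module. A minimal sketch, where both functions are stand-ins for whatever performance-critical piece is actually at stake -- one written idiomatically, one as a literal port from some other language:

```python
import timeit

def idiomatic(data):
    """Hypothetical performance-critical piece, written the way the
    language was meant to be used: a generator expression with sum()."""
    return sum(x * x for x in data)

def naive(data):
    """The same computation, written as a literal index-by-index port."""
    total = 0
    for i in range(len(data)):
        total = total + data[i] * data[i]
    return total

data = list(range(1000))
for fn in (idiomatic, naive):
    t = timeit.timeit(lambda: fn(data), number=1000)
    print(f"{fn.__name__}: {t:.4f}s")
```

The two versions are the point: timing the literal port penalizes the language unfairly; only the idiomatic version tells us anything about Python itself.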

We can only compare a specific problem, with a specific algorithm. That's the basis for all benchmark comparisons. Since each implementation was well-done and properly optimized, the remaining degrees of freedom are the language, the run-time implementation of that language, and the selected OS and hardware.

Thursday, December 30, 2010

Top Language Skills

Check out this item on eWeek: Java, C, C++: Top Programming Languages for 2011 - Application Development - News & Reviews - eWeek.com.

The presentation starts with Java, C, C++, C# -- not surprising. These are clearly the most popular programming languages. These seem to be the first choice made by many organizations.
In some cases, it's also the last choice. Many places are simply "All C#" or "All Java" without any further thought. This parallels the "All COBOL" mentality that was so pervasive when I started my career. The "All Singing-All Dancing-All One Language" folks suffer the most shattering disruptions when their business is eclipsed by competitors who use language and platform as a technical edge.

The next tier of languages starts with JavaScript, which is expected. Just about every web site in common use has some JavaScript somewhere. Browsers being what they are, there's really no viable alternative.

Weirdly, Perl is 6th. I say weirdly because the TIOBE Programming Community Index puts Perl much further down the popularity list.

PHP is next. Not surprising.

Visual Basic weighs in above Python. Being above Python is less weird than seeing Perl in 6th place. This position is closer to the TIOBE index. It is distressing to think that VB is still so wildly popular. I'm not sure what VB's strong suit is. C# seems to have every possible advantage over VB. Yet, there it is.

Python and Ruby are the next two. Again, this is more-or-less in the order I expected to see them. This is the second tier of languages: really popular, but not in the same league as Java or one of the innumerable C variants.

After this, they list Objective-C as number 11. This language is tied to Apple's iOS and MacOS platforms, so its popularity (like C#'s and VB's) is driven in part by platform popularity.

Third Tier

Once we get past the top-10 tier -- Java/C/C++/C#/Objective-C and PHP/Python/Perl/Ruby/JavaScript -- we get into a third realm of languages that are less popular, but still garnering a large community of users.

ActionScript. A little bit surprising. But -- really -- it fills the same client-side niche as JavaScript, so this makes sense. Further, almost all ActionScript-powered pages will also have a little bit of JavaScript to help launch things smoothly.

Now we're into interesting -- "perhaps I should learn this next" -- languages: Groovy, Go, Scala, Erlang, Clojure and F#. Notable by their absence are Haskell, Lua and Lisp. These seem like languages to learn in order to grab the good ideas that make them both popular and distinctive from Java or Python.

Saturday, January 2, 2010

Building Skills in Object-Oriented Design

Completely revised Building Skills in Object-Oriented Design. Cleaned up many of the exercises to make them simpler and more sensible to the n00b designer.

Also, to make it easier to follow, I made use of the Sphinx "ifconfig" feature to separate the text into two parallel editions: a Python edition and a Java edition. A little language-specific focus may help.

Interestingly, I got an email recently from someone who wanted the entire source code for the projects described in the book. I was a little puzzled. Providing the source would completely defeat the purpose of the book, which is to build skills by actually doing the work. So, no, there is no source code available for these exercises. The point is to do the work, all the work, all by yourself.

Wednesday, October 14, 2009

Unit Testing in C

I haven't written new C code since the turn of the millennium. Since then it's been almost all Java and Python. Along with Java and Python come JUnit and Python's unittest module.

I've grown completely dependent on unit testing.
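That dependency is cheap to maintain; a minimal unittest case looks like this (the function under test is, of course, just a stand-in):

```python
import unittest

def mean(values):
    """Stand-in for real application code."""
    if not values:
        raise ValueError("mean of empty sequence")
    return sum(values) / len(values)

class TestMean(unittest.TestCase):
    def test_typical(self):
        self.assertEqual(mean([1, 2, 3]), 2)

    def test_empty(self):
        with self.assertRaises(ValueError):
            mean([])

if __name__ == "__main__":
    unittest.main(exit=False)
```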

I'm looking at some C code, and I want a unit testing framework. For pure C, I can find things like CuTest and CUnit. The documentation makes them look kind of shabby -- until I remembered what a simplistic language C is. Considering what they're working with, they're actually very cool.

I found a helpful posting on C++ unit testing tools. It provided some insight into C++. But this application is pure C.

I'm interested in replacing the shell script in CuTest with a Python application that does the same basic job. That's, perhaps, a low-value add-on; maybe I should look at CUnit instead and leave the CuTest shell script alone.
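That shell script scans the C sources for test functions and generates a registration file; a Python replacement is mostly a regular expression and some templating. A sketch of the idea, not a drop-in replacement -- it assumes only the CuTest convention that tests are void functions taking one CuTest * argument:

```python
import re
from pathlib import Path

# Match CuTest-style test definitions: void TestSomething(CuTest *tc)
TEST_DEF = re.compile(r"void\s+(Test\w+)\s*\(\s*CuTest\s*\*")

def find_tests(source: str) -> list:
    """Return the names of CuTest test functions defined in a C source string."""
    return TEST_DEF.findall(source)

def make_runner(test_names: list) -> str:
    """Generate the boilerplate C that registers every discovered test."""
    adds = "\n".join(f"    SUITE_ADD_TEST(suite, {name});" for name in test_names)
    return (
        '#include "CuTest.h"\n\n'
        "CuSuite *MakeSuite(void) {\n"
        "    CuSuite *suite = CuSuiteNew();\n"
        f"{adds}\n"
        "    return suite;\n"
        "}\n"
    )

if __name__ == "__main__":
    names = []
    for path in Path(".").glob("*.c"):
        names.extend(find_tests(path.read_text()))
    print(make_runner(names))
```

A real version would need to skip comments and declarations, but even this much is easier to maintain than sed and grep.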

Monday, July 27, 2009

Programming Language Popularity -- Update

I used to rely on the TIOBE Software Programming Community Index.

Today, I learned about the langpop site. The context was following an SD Times blog posting on Haskell. But I got distracted looking at language rankings and what it means to consulting companies.

[Until corrected, I was] particularly drawn to langpop's Amazon listing. It seems like everyone wants books on C, C++ and C#. All other languages are also-rans. Why? [I guessed] that it's because those languages are so (a) hard to work with and (b) are perceived as "old school" where print resources are more prevalent than on-line resources. [Turns out the rankings are screwed up. Dang.]

Marketplace

The langpop aggregate ranking is C, Java, C++, PHP, JavaScript, C# and Python. The TIOBE Community Index is Java, C, C++, PHP, VB, C#, Python.

Similar enough to confirm that a half-dozen languages dominate with two others fighting for position.

Years ago, one of our senior consultants made the case that we really have three fundamental tech-stacks into which our web services had to fit: VB/.Net, C#/.Net, and Java. Corporate IT doesn't demand a lot of PHP or Python from us.

Perhaps those are opportunities for growth. Or perhaps there isn't enough corporate IT demand for those frameworks.

Thursday, July 23, 2009

Java vs. PL/SQL

Quite a while ago, I compared Java and PL/SQL to gauge their relative performance.

Recently (okay, back in mid-June) I got this request.
One thing I would like to compare is Java vs PL/SQL using native
compilation (search Oracle for NCOMP). Would you be willing to repeat
your benchmark tests using NCOMP? NCOMP is pretty straightforward to
set up, I think it is even easier in 11g, if you are using that.

Also, when you test Java vs. PL/SQL, are you using Java stored
procedures in Oracle, or are you using an external VM and connecting
to Oracle? (One annoying limitation to Java Stored Procedures is the
lack of threading ability, among a few other things).
Native Compilation will not make PL/SQL magically faster than Java. The very best it can do is make PL/SQL as fast as Java. The clunky inelegance of PL/SQL isn't fixed by NCOMP, either.

My test was PL/SQL stored procedures in Oracle. These were compared against Java programs in a separate JVM. I didn't use Java stored procedures because the client didn't ask for this.

The client had legacy C code they wanted reverse engineered and reimplemented. PL/SQL was unsuitable for this task for a number of reasons.
  1. PL/SQL is slower than C or Java. Speed mattered.
  2. PL/SQL is a clunky and inelegant language. Worse than C and far worse than Java. The application would have grown to gargantuan proportions.
  3. The legacy C code was full of constructs that would have to be rethought from their very essence to recast them in PL/SQL. Java, for the most part, is a better fit with legacy C. The reverse engineering was relatively easy moving from C to Java.
There were some additional, minor considerations.
  1. There is some unit testing capability in PL/SQL (UTPLSQL), but it's not as feature-rich as JUnit. Unit testing was essential for proving that the legacy features were ported correctly.
  2. PL/SQL is hard to develop. A nice IDE (like NetBeans or Eclipse) makes it very easy to write Java. The customer was using Toad and wasn't planning to introduce the kind of IDE required to build large, complex applications.
In short, the simple speed test -- PL/SQL vs. Java -- was sufficient to show that PL/SQL is simply too slow for compute-intensive speed-critical applications.
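Proving the legacy features were ported correctly reduces to characterization tests: capture the legacy program's outputs for known inputs, then assert the new implementation reproduces them. A sketch with unittest -- the function and the recorded values here are entirely hypothetical:

```python
import unittest

def ported_rate(balance, days):
    """Hypothetical reimplementation of a legacy C calculation."""
    return round(balance * 0.05 * days / 365, 2)

# Hypothetical outputs captured from the legacy C program for known inputs.
LEGACY_CASES = [
    ((1000.00, 30), 4.11),
    ((2500.00, 90), 30.82),
    ((0.00, 365), 0.00),
]

class TestPortMatchesLegacy(unittest.TestCase):
    def test_matches_recorded_outputs(self):
        for args, expected in LEGACY_CASES:
            self.assertEqual(ported_rate(*args), expected)

if __name__ == "__main__":
    unittest.main(exit=False)
```

The same pattern works in JUnit for the Java side; the captured legacy outputs are the oracle either way.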

Saturday, July 11, 2009

Flying Saucer

The old code was 5700 lines of bad VB.

The new code is Velocity, Flying Saucer, iText and 120 lines of glue. The old code will be replaced with perhaps 500 lines of XHTML-producing Velocity templates.

[The Flying Saucer site -- with the main menu on the right -- was confusing at first. It made it look like a half-baked semi-functional idea for an open source project. Boy was I wrong. It totally rocks!]

I am absolutely delighted at the FS world-view.
  • Process the entire CSS specification -- every feature -- especially those related to printing.
  • Don't tolerate malformed XHTML. Gecko tolerates all kinds of HTML problems, which makes it big and sophisticated. Flying Saucer simply rejects ill-formed XML, which keeps it simpler and better able to handle every CSS nuance.
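The same world-view shows up in Python's standard library, for what it's worth: xml.etree.ElementTree also rejects malformed markup outright, where an HTML-tolerant parser would limp along.

```python
import xml.etree.ElementTree as ET

good = "<p>A <em>well-formed</em> fragment.</p>"
bad = "<p>An <em>unclosed emphasis.</p>"

print(ET.fromstring(good).tag)  # → p

try:
    ET.fromstring(bad)
except ET.ParseError as exc:
    # Strict parsing fails fast instead of guessing at the author's intent.
    print("rejected:", exc)
```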
Since the document is very simple (with no side-bars or floating elements), simple CSS works. And the Flying Saucer PDF matches the HTML completely. The match was so good that I did a double-take at the first PDF I made.

The best part is being able to chuck 1000's of lines of VB and replace them with 100's of lines of XHTML. I think that could stand to reduce long-term maintenance costs.

Wednesday, May 27, 2009

That's odd -- Python faster than Java

Here's an amazing Stack Overflow question.  The follow-up conversation is great stuff.

The question shows two versions of approximately the same processing.  Python is faster than Java.  That's unexpected.

Java has static compilation and the hot-spot translation to machine code.  Apparently, Python has some optimizations that are just as valuable.

Thursday, May 21, 2009

Applet Not Inited; the "Red X" problem

I haven't done Applet stuff in years.

I do -- intensely -- like embedding functionality in web pages.  RIA/Ajax and what-not are something I have trouble with because I'm not a graphic designer.  Uses of JavaScript and applets fall into three clear categories:
  1. Basic usability.  Javascript offers lots of little enhancements to HTML presentation that make sense.  Emphasis on "little".
  2. Client-side features.  Many things are simple calculators or other processing that makes sense on the client side -- the relevant factors can be downloaded and used by an applet or javascript script.  
  3. Junk.  There are lots of graphical effects that vary from gratuitous to irritating.  Too many folks in marketing see some "pop-up" technique and think it's cool.  Worse, they'll take an application that lacks solid use cases and try to add flashing or scrolling to emphasize something instead of reducing clutter and distraction.  Sigh.
The Common Problems

In all web-based software development, the number one problem is always permissions.  Always.  In the case of applet development, this is always hurdle number one.  The file isn't owned by the right person or doesn't have the right permissions.  You see the "applet not inited" and "red X icon" as symptoms of the applet not being downloaded at all.

The number two problem is access to resources.  Usually this is a CLASSPATH issue, but it can also be an HTML page with a wrong URI for the applet's code.  You see the "applet not inited" and "red X icon" as symptoms of the applet not being referenced correctly, or not being able to locate all of its parts.

[Technically, the basic access comes before permissions, but you usually don't get access wrong first.  Usually, you get permissions wrong; later, you discover you have a subtle access issue.]

One of the more subtle manifestations is the case-matching issue.  Your Java class definitions are usually UpperCase.  The source file and resulting class file will have this same UpperCase format.  But if you get the case wrong in your HTML, you just get an "applet not inited" error.  Arrgh.

When you don't work with applets all that often, the "applet not inited" is baffling.

Misdirection

I wasted hours on Google and Stack Overflow looking up "applet not inited" and "Red X icon" and similar stuff.

Then I looked at the HTML I was testing.

Surprise.  No one had moved the .jar file into the proper directory.

There's a lot of stuff on the applet not inited error.  Most of it misses the usual culprits: permissions and access to the resources.