I just ran across an article by Steven G. Bloom about Iowa and how it is ridiculous for it to play such an influencial part in picking a president. There have been a lot of articles by Iowans saying how inaccurate the article is. But they don't actually address the main point in his article, that Iowa is very different from the rest of the country and that it should not be so influencial in choosing a president. So, what states are typical? Does he think that California is it? New York? I lived in upstate New York for seven years, but have never lived in California. However, it seems to me that most people outside of New York or California do not think of either of them as typical.
I live in Illinois. Illinois has two parts, Chicago and "down state". Down state has some decent sized cities; Peoria, Springfield, Champaign-Urbana (where I am from), Bloomington-Normal. But down state is pretty similar to Iowa. The biggest difference between the states other than Illinios having Chicago is that Iowa has way better politicians. Like nearly every other state. (Sigh!)
Indiana is to our east. In spite of Indianapolis, it is a lot like Iowa, too. Ohio has a bunch of big cities, but in between them it is a lot like Iowa. Wisconsin, Minnesota, North and South Dakota, Oklahoma, Nebrasks, Missouri. Even western Pennsylvania. My daughter lived in Kentucky for awhile, and it isn't that different. The rural south is different in some ways from the rural midwest. Rural Mississippi is heavily black, for example. But they like to hunt and are more religious than average for Americans, two of the prominent features of Bloom's Iowa. Rural people in Texas and Colorado are more likely to own horses than hunting dogs, but the spirit of rural people is pretty similar around the country.
The red vs. blue divide in the US is mostly rural versus urban. It is pretty evenly divided, which is why there is so much political struggle.
I claim that Bloom's main point is wrong. Iowa is typical. The fact that Bloom can say that means he hasn't much of an idea of what the US is like. I'd still like to know what he thinks is typical. I think it would be illuminating.
by Ralph Johnson (johnson@cs.uiuc.edu) at December 18, 2011 12:49 AM
A story of open source before open source was cool. Agile development in the 70s. How VistA helped the VA provide the "best health care anywhere". That is a talk at OSCON, by the way.
by Ralph Johnson (johnson@cs.uiuc.edu) at December 17, 2011 12:43 PM
One of Christopher Alexanders underappreciated ideas is that of a pattern sequence. A pattern sequence is a set of patterns that are often applied in a particular sequence, a "pattern of patterns", as it were.
James Coplien has been trying to promote the idea of pattern sequences for some time. He and Neil Harrison wrote a very short paper about them and he wrote a longer paper with Ronald Porter and Tiffany Winn. Although I haven't written much about sequences, I agree that they are important. Kent Beck and I wrote a paper on the patterns behind HotDraw some years ago that was called Patterns Generate Architectures. It shows a pattern sequence that produces a graphical drawing editor. We had hoped at the time that other people would describe their designs as sequences of patterns, and a few people did, but not as many people as I wish.
I have been thinking about pattern sequences recently as a way of organizing design patterns. In particular, I think the "creational patterns" from Design Patterns show several simple pattern sequences. The first step of these is always Factory Method, which can then be followed by either Abstract Factory, Builder or Prototype. Abstract Factory and Builder can be combined; either a design with Abstract Factory can be replaced with one with Builder, or (as the book says) a Builder can be implemented with an Abstract Factory. The book says a builder can be implemented with Prototype, which is plausible, but I don't remember ever seeing one in the wild. So, the standard sequences are Factory Method followed by Abtract Factory and then Builder, Factory Method followed by Builder, and Factory Method followed by Prototype.
Factory Method is such a trivial pattern that sometimes I wonder why we put it in the book. It is basically the same as Template Method, but seen from the point of view of the method that the template method calls. However, the fact that it is the start of these (very short) pattern sequences makes it much more interesting. So, I have decided that I am happy including it with the others.
The point of Factory Method is to avoid hard-coding class names when you are constructing objects. Instead of saying "upButton := Button new" you should say "upButton := self createButton" and hide the reference to the Button class in the createButton method. The advantage of Factory Method comes when you want to change the class of buttons. You can make a subclass of whatever is making the buttons and override the createButton method. On the other hand, if you never need to change the class of your buttons then it is pointless to use Factory Method. Moreover, if you have to change the class of your buttons too often then Factory Method is also a bad idea, because it leads you to make too many subclasses. So, I tend to start with a design in which class names for constructing objects are hard-coded. The first time I want to change them, I change the design to use Factory Method. If I see that I am building a lot of subclasses, I switch to one of the other patterns.
One of the other patterns that I often use in Smalltalk is to use a class as a factory and to pass it in to the object that is creating instances. I call this a "Factory Object" pattern, with the class as a factory. But if I want a single object to create a lot of different kinds of objects (as would happen if my application needed not only to change the class of Button but also of TextField and DropDownSelection and RadioButton and ...) then I'd use Abtract Factory, which I think of as another special case of Factory Object. It is easy to convert a design that uses Factory Method into a design that used Abstract Factory; just move all the individual "create" methods into an instance of a new factory class, and then pass the factory into the object that needs objects to be created. If you use Abstract Factory then the object doing the creating will work with a factory and an object to be created. But if you shift that design to Builder then the object doing the creating will work only with a builder; the builder will hide the object being created until the object is finished. Builder is another kind of "factory object". And Prototype is a fourth kind of factory object, one in which you can use any instance as a factory for more instances by asking it to clone itself. But in all cases, a factory object is a little more complex and powerful than factory method, and you should only suffer the extra complexity if you really need the power.
What other pattern sequences do you use a lot? Do you know of papers that describe systems as a sequence of patterns?
by Ralph Johnson (johnson@cs.uiuc.edu) at December 12, 2011 07:54 PM
Tom Munnecke posted a video of conversation we had about refactoring, design, and VistA. I agreed with most of what he had to say, but at the time of the conversation I was uncomfortable about what he said about decomposition and how he used the story of the blind men and the elephant. If you don't know that story, see the poem that introduced it to the west, or the Wikipedia article. I use the story when I teach OO programming. I give three different views of OO programming, and say that they are all true, but each view is limited. OO is a culture, and no description of a culture can be complete.
Science (and western thought in general) is based on reductionism, on making models, on abstracting. It is a very powerful tool. But patterns are very much about balancing forces, about appreciating the disadvantages of a technique as well as the advantages. The story of the blind men and the elephant is useful because it warns us of the dangers of reductionism. The story seems to have been told originally to warn against religious claims to exclusive truth. But Tom used it (and I do too) against a more broader narrow-mindedness.
Most people do not want to be narrow-minded. We want to be "right-minded", i.e. to see things correctly. When you tell someone they are narrow minded, they tend to get mad at you. But if you say "you are forgetting this part of the problem" then they are more likely to actually think about it. They might end up agreeing or disagreeing, but they will at least talk about it with you. We all know it is easy to miss things, that we only see part of the situation.
One of Tom's metaphores is a cat; once you disect a cat then you can't put it together again. It is not a good metaphore for understanding, because we could draw blood samples from a cat or take X-rays of it without killing it. Blood samples are only a tiny part of a cat, and only show us a fraction of what it is, but sometimes a veterinarian can learn something important that way. However, it is a great metaphore about making changes to a system.
I think that Tom uses the cat metaphore because he is concerned that the VA has made large changes to VistA without understanding it, and these changes involve cutting out parts of it without realizing that the parts that are cut out are deeply connected to the others. From talking with him, I can see three things that he thinks people have ignored.
1) Tom sees VistA as much more than just a piece of software. VistA is a community, a process for working with the software, a knowledge base. One of his original purposes of building VistA was to create a community, and he not only succeeded but is proud of how he succeeded. So, it pains him to see people try to reuse the software but at the same time ignore the community, and perhaps even supress it.
2) VistA was developed in a bottom-up fashion. Each hospital had a few programmers, and each programmer worked closely with doctors. There was no overall architect or plan, other than improving the health of veterans. This process undoubtedly had some weaknesses, but it had tremendous benefits, as well. It resulted in a system that fits the needs of doctors. It is a major reason why VistA has been so successful, and the VA has worked hard for 15 years to try to have a more top-down development style, and has been impressively unsuccessful.
3) VistA is driven by metadata. Metadata means different things to different people, and VistA has a variety of kinds of metadata. Sometimes it means the data that indicates the provenance of the real data, and VistA has that. But the kind that interests me the most and that I think interests Tom the most is configuration data that controls the behavior of the system. Most hospitals that use VistA have a cadre of medical experts (usually trained as doctors or nurses) who configure VistA. Most of the work of configuring VistA to a hospital does not require programming in Mumps, but just changing the metadata. This means that the details of VistA are under the control of medical experts (who have to become VistA experts before they can exert this control) and not programmers. Tom thinks that VistA should have a lot more metadata, and that there are a variety of things in VistA that now require programming that ought to be done by the experts who configure metadata.
These three aspects of VistA (and I expect there are more; these are just the three I can recognize) contribute to making VistA such a good fit for healthcare. Healthcare varies enormously, and it is hard for one piece of software to fit everywhere. The second and third aspects are part of what makes VistA so configurable; the first aspect helps people to learn how to configure it.
This description of VistA is just the parts of the elephant that I have felt. It is a complex system and perhaps is not knowable by any single person, no more than a city can be completely known or the internet can be known. Nevertheless, I know Champaign pretty well and I know Chicago better than I know New York City, though I know New York City better now than I did a year ago. I look forward to getting to know VistA better.
by Ralph Johnson (johnson@cs.uiuc.edu) at November 02, 2011 12:00 PM
This is a PhD thesis on applying design patterns (among other things) to building Technology Enhanced Mathematics Education, i.e. computer systems for helping teach mathematics. These patterns are more closely related to "pedagogical patterns" than to software design patterns, but the thesis is probably interesting to others doing research in patterns.
http://www.yishaymor.org/phd
by Ralph Johnson (johnson@cs.uiuc.edu) at August 06, 2011 12:22 AM
I remember a bit of excitement about Naked Objects a few years ago, and I thought it was a promising approach. I haven't heard much recently, and wondered what had happened to it.
Check out the 2004 PhD thesis by Richard Pawson, which is easier to read than most theses. I think it is an important idea. The goal is making it easier to get to the heart of the users problem domain, and to be less distracted by technical frameworks.
And now the Naked Objects has become an Apache project!
It will be good to see people use this more.
by Ralph Johnson (johnson@cs.uiuc.edu) at July 07, 2011 03:49 PM
I just found a good article on what makes VistA good. VistA was developed in the 70s and 80s using an open source development model. In the last two decades, the VA has often tried to develop it using a more commercial top-down model, and has repeatedly failed. What makes VistA good is that programmers pair-program with clinicians. They work closely to make sure that the system does exactly what the hospital needs. That is the process that led to a system that has won all kinds of awards, and it is not an accident.
by Ralph Johnson (johnson@cs.uiuc.edu) at June 29, 2011 10:25 AM
I'm spending the summer in New York (Westchester county) working on a framework for business transaction processing. Well, I'm really working to help bring "Liquidity Manager" to the point where we can try it out on the first customer, but that is the same thing. The company is called "Metaficient" and it is a typical startup with a few smart and dedicated people working hard to bring their first product to market. It is fun. The hardest part for me is being away from home for so long, but my son Caleb is out here with me, sort of bringing a piece of home with me.
The point of this note is to let people in NY know that I am here. About a year ago I exchanged e-mail with someone about contacting him if I came out this way, but I can't find it now. I recall he read my blog. And there are probably lots of people out here that I know. I've got a lot to do this summer, but I wouldn't mind getting away every now and then to talk to interesting people.
by Ralph Johnson (johnson@cs.uiuc.edu) at June 21, 2011 09:34 AM
A group of my students have developed an Eclipse plugin called CodingSpectator that collects information about how programmers use Eclipse They are collecting this data for research on how to improve Eclipse and for understanding in general how programmers use Eclipse. They are looking at a variety of topics, from refactoring to version control. CodingSpectator is easy to install and fairly unobtrusive. We need more people to sign up and help us out.
The biggest reason NOT to sign up (other than laziness) is that it ships information about your program to us. If you are working on code that you must keep secret then you probably shouldn't use it. We promise to keep your information confidential, but if you have promised someone else not to reveal any of the code then that is probably not good enough. Open source projects are no problem, of course. In fact, even if you want to keep your code confidential for awhile, perhaps because you haven't figured out licensing issues, that is no problem. We can even sign an NDA if that will help.
But for open source projects and most academic projects, there is no good reason not to use CodingSpectator, and it will help us a lot if you do. See our web site at http://codingspectator.cs.illinois.edu/ So, if you are an Eclipse developer, please sign up!
by Ralph Johnson (johnson@cs.uiuc.edu) at June 12, 2011 07:01 PM
Someone in a class I'm teaching wanted me to define double dispatch more carefully, so I thought I would do it here.
There is a Wikipedia article on double-dispatch. It is accurate, but not as clear as many Wikipedia articles. So, I thought I should try my hand at it.
Message-sending in Smalltalk chooses a method to execute based on the class of the receiver. Languages like Java and C++ are similar; they choose a method at run-time based on the class of a single object. Both Java and C++ seem to be more powerful because their static type system lets them overload functions, i.e. provide several functions with the same name but with different types of arguments. But this is not as powerful as double-dispatch, because double-dispatch lets a program dynamically choose a method based on the class of two objects (both the receiver and an argument) while overloading must be done at compile time. There are some languages that implement "multiple dispatch" or "multimethods" (I'm learning Groovy now, which is one of them) but the classic OO languages are based on single-dispatch and so double-dispatch is a programming technique or patterns.
Double-dispatch is a special case of multiple-dispatch. The purpose of double-dispatch is to choose a method based on the class of two objects, instead of just one. Maybe you want to choose a method based on three objects, or four. The technique I will describe can be generalized to any number of objects, but I am not going to explain that. If you understand how to do it with two, you can easily generalize to three or four. And double-dispatch really isn't that frequent, while dispatching on three or four is very rare.
The pattern is called "double-dispatch" because there are two message sends. Consider the + method. VisualWorks implements it with double-dispatching, but Squeak does not. The reason why arithmetic is a good candidate for double-dispatching is that adding a float and an integer should be the same as adding an integer and a float, but is quite different from adding two integers. You can add integers to fractions, or to complex numbers, or to polynomials (if you implement polynomials!) and in fact you can add any combination of them. You can multiply an integer and a matrix, but you can't add them. You can multiply an integer with a Money object, and you can add Money objects, but you can't add an integer to a Money object. Addition is more complicated that it seems; the behavior of + depends on both the class of the receiver and the class of its argument.
Double-dispatch has two dispatches to find the method that does the work. The first is on @+, of roucrse. Consider the expression "1.4 + 3", which involves sending a message to a Float with a SmallInteger as an argument. If addition were implemented by double-dispatch, the + method in Float would be something like
+ aNumber
^aNumber sumFromFloat: self
The message #sumFromFloat: is the second dispatch. Note that its name encodes the type of its argument. In general, it is bad style to encode the types of arguments in method names. Instead, method names should encode the role that the object plays. But patterns often involve breaking general rules in particular situations, and that is true here. Because the purpose of double-dispatching is to select a method based on the class of two objects, it is reasonable that it should eventually use a method that encodes the class of one of them. And the second method in double-dispatch ALWAYS encodes the type of the original receiver, which is now the first argument. Since double-dispatch is about finding a method based on the classes of two objects, there are always two important objects, the original receiver and a "descriminator". The first method in double-dispatch reverses the objects; it sends a message to the descriminator with the original receiver as an argument, and it encodes the class of the original receiver in the message. In Smalltalk, this means that the name of the message includes the class, while in a language with function overloading it is possible to use the same name for all the methods by having the type of the argument specify the particular class. But it is really all the same in terms of its impact on the structure of your program and on what happens at run-time. Smalltalk makes it easier to see what is really happening.
If you are trying to implement +, the first dispatch is on +, and the method for + will send a second message, which encodes the class of the original receiver. Thus, the second dispatch is on the class of the descriminator. The final method, the one that does the work, is in the class of the descriminator. Suppose that you are implementing + for Floats, SmallIntegers, LargePositiveIntegers, LargeNegativeIntegers, Fractions, and Complex. Then there would be six + methods, one for each class, and each would send a different second message, e.g., Fraction would send sumFromFraction: and Complex would send sumFromComplex:. That means that each class has to implement the six "double-dispatch" methods. In other words, there would be 36 methods handling the various combinations of classes.
In practice, it usually isn't this bad, since you can often use inheritance to reduce the number of methods For example, sumFromFraction: will do the same thing for any integer, so the three subclasses of Integer could just inherit a common method. But in the worst case, double-dispatching for N classes will result in N*N methods. But according to the original problem statement, the method depends on the classes of two objects, so the problem inherently needs all that code. The complexity comes from the problem, and the solution just reflects it. The nice thing about double-dispatching is that the code tends to be regular, and the final methods focus on a single subproblem and so tend to be simple. Once you learn the pattern, it is pretty easy to navigate the code and to add a new class. Yes, adding a class usually means adding a method to existing classes. This can be a problem in C++, but it is not a problem in Smalltalk. Double-dispatching gets used even in static languages like C++ and it is easier and so more popular in dynamic languages like Smalltalk.
by Ralph Johnson (johnson@cs.uiuc.edu) at June 12, 2011 11:55 AM
Clay Shirky had a blog about the collapse of complex business models. He referred to a book by Joseph Tainter called "The Collapse of Complex Societies" that I put on my must-read list. The book describes why societies collapse, and implies that it is necessary. Clay says the same thing about the collapse of complex business models; there is no way for the old companies to adopt the new business model, so they will eventually collapse along with their old business model. Naturally, I thought about software. Is there any way to save complex software when it needs to change and can't? We can refactor it, throw parts away, reuse parts of it. But is that any different from what collapsing societies do?
by Ralph Johnson (johnson@cs.uiuc.edu) at April 11, 2011 12:36 PM
For the past few years, I have been working on patterns for parallel programming. One result has been a series of workshops, with associated papers, and a collection of patterns that we have been working on with people at Berkeley and elsewhere. We are having another workshop soon, ParaPLoP 2011, and people interested in patterns of parallelism should submit a paper.
Another result has been a book, Parallel Programming with Microsoft .NET: Design Patterns for Decomposition and Coordination on Multicore Architectures. This book targets experienced .NET programmers who are about to do their first bit of parallel programming. Consequently, it focuses on "easy parallelism" instead of wringing the most performance out. It uses the latest tools that come with VisualStudio 2010.
Parallel programming is about performance, so it is never trivial, but this book is a fairly gentle introduction. Visual Studio 2010 supports parallelism a lot better than earlier versions. If you program on the Microsoft platform and are interested in learning parallel programming then you should get this book. Note that you can download an electronic copy for free, or buy a paper copy.
I'm officially a co-author, but the other authors did a lot more work than I did. So, I don't feel like I am tooting my own horn when I say that it is a pretty good book.
by Ralph Johnson (johnson@cs.uiuc.edu) at March 31, 2011 01:59 PM
VistA has been discussed in the Linux world at http://lwn.net/Articles/433793/ under how to turn VistA into a real open source project.
Tom Munneke responded to it at http://munnecke.com/blog/?p=1055
I found an interesting VistA blog by Rick Marshall who uses Christopher Alexander's Oregon Experiment to explain the development of VistA. This might explain why VistA is so successful in spite using a lot of off-putting technology. http://vistaexpertise.blogspot.com/
One of the things he said is :
Master plans prevent that dance of solutions with problems in four ways.
First, they emphasize the big picture over the details, wasting energy on prophecy that should have been spent on understanding the problems better, so the necessary details of the solution are missing or vague.
Second, what details they do specify are shaped by design to create (i.e., to serve) the totality, not by discovery to actually solve the problems, so they tend to be out of sync with the reality of the problems (i.e., wrong).
Third, they rigidly lock in those details, preventing them from moving with the problems as the problems shift, soon making any accurate details outdated and incorrect.
Fourth, the nature and interactions of the problems we aim to solve are far too complex to capture in a plan to begin with, ensuring that no matter how much effort goes into a medical-informatics master plan, its worldview always ends up being a ridiculous cartoon of the situation it is meant to deal with.
Any one of these characteristics would be enough to cripple a master plan. Together, they create irresistible forces that cause all VISTA master plans to converge on the same brief lifecycle of (1) ballyhoo, (2) bogging down, and (3) breakdown. And yet the inevitability of failure evidently can't compete with the intoxicating feeling of control a master plan creates, judging by the relentless parade of VISTA-replacement (or "modernization") boondoggles. It would be tragic if after fifteen years it weren't so drearily risible.
Still, at least we can serve as an object lesson to validate Alexander's point:
Master plans fit the shape of the problems with too little precision, and so they fail.
by Ralph Johnson (johnson@cs.uiuc.edu) at March 30, 2011 02:00 PM
Ted O'Neill was one of the founders of VistA, and he just passed away.
http://munnecke.com/blog/?p=1037
It makes a facinating story!
by Ralph Johnson (johnson@cs.uiuc.edu) at March 12, 2011 09:46 PM
It looks like the conference on aspect oriented software development is renaming itself to "Modularity". I thought that was interesting.
Modularity is an important idea in software, but it means different things to different people.
One of the most important papers on modularity is the one by David Parnas, "On the Criteria to be Used in Decomposing Systems into Modules". This was the paper that introduced the term "information hiding", where it meant that a module should hide decisign decisions. In this context, the purpose of modularity is to break a system down into units that hide design decisions, making it easier to change them later. This implies that a module will hide data representation, providing an argument for abstract data types and object-oriented programming.
Another purpose of modularity is reuse. People often say "components" instead of "modules", but the basic idea is the same. Boh benefit from high cohesion and low coupling. Modularity is a broad topic, component just narrows it a bit.
If you are trying to build a fault-tolerant computing system, you need to divide your program into pieces that run on different computers so that if one of the computers fails, the others can keep on going. Sometime the pieces are called processes. I imagine there are a variety of names for this concept. But they are modules, of course. Just another kind.
If you are trying to make a secure system, you need to "compartmentalize" your system. A compartment is a unit of your system such that if someone breaks into one compartment, they do not automatically get access to another. You can implement compartments with separate address spaces on your computer ("process" in Unix terms) that send messages back and forth ("pipes") and do not share memory. You need to restrict communication because the more powerful the communication, the easier it is to break into the compartment. In contrast, the fault-tolerant computing kind of module needs separate computers, with separate power supplied, maybe in different buildings (in case of fire) or different cities (in case of nuclear attack). The fault-tolerant computing kind of module isn't effected as much by a powerful communication channel.
Privacy is a lot like security. I have my private data, and I will share some of it with you, but only if you promise not to share it with others. There is a notation of privacy regions, and I want to restrict my data to certain regions.
Parallel programming has processes, which aren't very modular. You can break up your computation in a more modular way by using patterns like divide and conquor. You can break up your data into units, too, perhaps in a grid or an array. But modularity doesn't seem to be as emphasized in parallel programming as elsehwere, which might be part of the reason that parallel programming is hard.
So, what kind of modularity have I missed? I know some of these ideas have other names; can you give examples? Do you know anybody who has explored this whole topic before?
by Ralph Johnson (johnson@cs.uiuc.edu) at March 08, 2011 02:46 PM
For the past week I have been looking at a system developed by the Department of Veterans Affairs for medical records called VistA. By some measures, it is the best system for medical records in the world. On the plus side, the physicians like it; it captures all the information they need and they are able to quickly find out everything need about a patient. The VA has something like 9 million people in this database, and they have been able to use it to discover that certain treatments are better than others. it is an excellent system for recording and analyzing medical outcomes. The VA claims it has saved them lots of money, and they have won many awards for it. On the other hand, it doesn't do billing (which is important for most hospitals outside of the government) and the software is old and is hard to change.
I found out about it because we had a visitor from the VA last week who talked about it and who said that the VA is considering making VistA open source in an attempt to revitalize and rejuvinate it. He talked about refactoring this 14 million like program, and that made me interested. So, I started investigating, and I found out lots of interesting things.
The first thing I found out is that VistA is already open source. The VA is required to release all their software under the Freedom of Information Act, this is usually called FOIA VistA. It is public domain, but it takes more than that to make software open source. There are two main communities that have taken one of these versions and then started hacking on it. They are called WorldVista and OpenVista (the latter is associated with a company called ). Both communities keep up with changes in the FOIA VistA and changes migrate back and forth between these two systems, but I am not sure how well.
SEi studied the VistA open source ecosystem and wrote a report about what is needed to make it more healthy. One of the things that needs to change is that the VA needs to quit ignoring the rest of the world.
http://conway.isri.cmu.edu/~jdh/web-pubs/pdfs/vista_report_2010_final-formatted.pdf
Not too surprisingly, VistA started out as an open source style project. Each hospital had a few programmers working closely with a few doctors, and they gradually developed the infrastructure and the applications. It was all underground, and in fact that VA tried to kill it for awhile, then in the early 80s decided to support it, and then when it started winning awards got to be proud of it. But it is not clear that management has ever understood it. See the story at http://hardhats.org/history/hardhats.html
VistA is a three-tier client server system. It started out as a command-line system (typical of the 70s) with two layers. The bottom layer was a language called MUMPS and a few subsystems written in MUMPS, one of which, FILEMAN, is an interesting DBMS. Then a bunch of applications were built on top of it. They are called "packages". VistA became 3-tier in the 80 when a client component writen in Borland Delphi was added. It only runs on Windows, but gives the GUI that doctors like. Programmers tend to live in the command-driven world of the original VistA server, which runs on lots of platforms, often Linux but actually just about anything.
http://vistapedia.net/index.php?title=How_does_VistA_work
http://vistapedia.net/index.php?title=What_is_VistA_Really
Our nation is having a health crisis. Part of this is due to the lack of standardization of health records. It is amszing to me that something as good as VistA is open source, but it is not being used any more than it is. It has a potential to help make healthcare cheaper and better.
by Ralph Johnson (johnson@cs.uiuc.edu) at March 07, 2011 03:13 PM
Aloha! I’d like to invite you to the 33rd International Conference on Software Engineering, to be held on May 21-28. I’m gonna be there along with my friend Fredrik, Danny and Marc, presenting our Research Paper “Refactoring for Immutability“.
The International Conference on Software Engineering (ICSE) is the premier software engineering conference, providing a forum for researchers, practitioners and educators to present and discuss the most recent innovations, trends, experiences and concerns in the field of software engineering.
The ICSE 2011 theme of Software by Design reflects the widely-held view that the most important ingredient in ensuring a software system’s long-term success is its design. While other concepts, concerns, activities, artifacts, and processes are important to the success of a software engineering project, it is the quality of a system’s design that provides the critical ingredient. Highlighting design as the conference’s theme is meant to focus attention upon design’s centrality, to encourage reflection on design experience, to inspire new designers, and to encourage development of new design techniques.
See you in Oahu, Hawaii!
by gacevedo at February 03, 2011 05:28 PM
Mike Ernst has a large collection of advice to undergrads, grad students and even professors. He wrote some of it, but most of it he collected from others. You can find it at http://www.cs.washington.edu/homes/mernst/advice/
by Ralph Johnson (johnson@cs.uiuc.edu) at September 30, 2010 12:30 PM
There is a new call to action called Software Engineering Method and Theory with a large number of signatories. Wednesdays are when Darko Marinov and I and a few of our students usually meet for lunch and talk about software engineering issues. Yesterday we decided to talk about SEMAT and we had about a dozen people come, much more than usual.
First, I think it is great that people are acknowledging there is a problem and want to do something about it. The state of the practice in software development is pretty dismal. Some groups do a great job, but most do not. As I tell the students in my software engineering course, if you manage requirements, make sure the developers talk to each other, release working code regularly, have some sort of a systematic testing process, use build and version control tools, and periodically stop and see how you are doing and how you can improve, you will be better than 90% of the groups out there. Of course, I could be exaggerating. Maybe it is only better than 75%.
The call to actions states:
Software engineering is gravely hampered today by immature practices. Specific problems include:
I support completely the second, fourth and fifth points. I think the first and third are symptoms, not causes, and fairly unimportant symptoms.
One of the things that makes me worry is that when academics say "theory" they mean math, but that is not necessarily what people in industry mean. I was very happy to see that Alistair Cockburn was a signatory, because in my opinion he is one of the best theory-developers for software engineering. Even though he has a PhD, he definitely doesn't act like an academic. He is a consultant and firmly in the industry camp. His theories have no math in them, but are instead more like theories in sociology or anthropology. That is because he focuses on people more than product.
Part of the problem is defining software engineering. Do they mean all software development, or only software development practiced by people with software engineering degrees? Or with software engineering in their titles? What about the typical bank or insurance company, which has software development groups whose managers have little software development background, and where the average developer does not even have a CS degree, must less formal training in software engineering? I think the main problem with software development is that most groups do not do the things that we have known for 30 years to be important. Most people who manage software groups do not really understand software development, so they are incapable of doing a good job.
I think that the theory that all software developers need is the kind that Alistair teaches. In addition to the theory, we need a set of standard base-line practices. We need training programs to ensure that the theory and practices are known. I think we should take a broad view of software engineering. Anybody who is developing software for somebody else should have a certain set of minimal standards. Certainly what NASA and Microsoft does counts as software engineering, but they are doing a better than average job and are not the ones that most need s widely accepted theory. The biggest need is with the bottom half of development groups.
by Ralph Johnson (johnson@cs.uiuc.edu) at December 03, 2009 12:37 PM
A nice description of refactorings in Eclipse. I'm putting it here so I can find it later when I want to give it to someone.
by Ralph Johnson (johnson@cs.uiuc.edu) at November 27, 2009 09:26 PM
http://www.kickstarter.com/projects/nickd/cadence-and-slang-is-a-book-about-interaction-design
This is an interesting reference to Christopher Alexander, an interesting idea for fund-raising, and what could be an interesting book. I pledged $40 for the book and I hope to see a copy!
-Ralph
by Ralph Johnson (johnson@cs.uiuc.edu) at November 24, 2009 04:59 PM
Nick Chen, one of my grad students, wrote a very interesting blog about parallelizing matrix multiplication. His main point is that you can get better performance from low-level optimizations than you can from parallelization, and that a lot of people are parallelizing the naive, unoptimized version which means that their results are meaningless. In general, MM is a lousy example for parallelization, since people should not be writing MM routines but should be using one of the many libraries that has it already optimized.
by Ralph Johnson (johnson@cs.uiuc.edu) at November 24, 2009 11:31 AM
It never fails that Murphy will make an appearance right when I am getting close to wrapping things up. We have bandwidth limitations on internet usage in Australia. Do you remember using dial-up modems? That's about what it feels like when you use all of your allocated bandwidth for the month. Unfortunately, it made it nearly impossible to access the UIUC wiki page to get the papers for the final reading assignment. Here is my lackluster, yet final, submission for this reading blog...
Branch-and-Bound
The branch & bound pattern is fairly well written. Although the use of the data flow diagram did assist in explaining the pattern, it is unfortunate there were no actual code examples to demonstrate the branch and bound operations of branching, evaluation, bounding, and pruning. It was beneficial we had previously discussed the task parallelism, data parallelism, shared queue, and data distribution as they were used to demonstrate the parallelization approaches. I do appreciate the authors not getting to deep into the noise of the technical aspects and kept the pattern fairly high level for a well-rounded audience.
Graphical Models
Does this pattern have anything to do with parallelism? Ok, I understand the idea of modeling the probability distribution, but I do comprehend how this is applied to everyday use. What is the true benefit here? The example, if you call it an example, provided no real insight into what the pattern was about. I wanted to read and comment on the pattern before listening to the recorded discussion. Now, I am interested to hear the follow-on discussion after reading my classmates' comments.
Structured Grid Computational Pattern
This pattern was co-authored by an UIUC undergrad alum who went on to UC-Berkely to complete her PhD. The idea of representing a multi-dimension problem in the form of structured grid implemented as an n-dimensional array is very interesting. I do not have any background or experience in this area. As the array is broken into chunks for parallel processing, the tasks cannot be too fine-grained resulting in additional overhead and reducing efficiency of parallelism. The Geometric Decomposition pattern is handy in this situation.
I am interested in looking at additional information related to "Ghost" nodes, double-buffering, and adaptive mesh refinement.
by James (noreply@blogger.com) at November 16, 2009 06:30 AM
PRES is a technique or system used for detecting, diagnosing, and reproducing a concurrency bug on multi-processors. I thought it was an interesting reading and was able to recognize similarities and differences between PRES and one our previous readings, CHESS.
In an ideal world, a programmer would love to reproduce a concurrency bug in a single replay and with no runtime overhead. PRES combines three ideas or techniques: (1) record only partial information to keep the replay close to the production run, (2) diagnose, process, and reconstruct unrecorded information to reproduce the bug, and (3) use feedback from unsuccessful replay runs to determine which paths do or do not produce the bug in question.
The various schemes of PRES are quite intriguing. I am not sure how thorough PRES is compared to CHESS. Even though CHESS is Microsoft-based, I would be interested in see a side-by-side comparison with identifying concurrency issues. Each system may be well on their way in building their own niche in the market. Considering the usefulness, efficiency, and feedback received from the concurrency debugging applications, I would definitely want to revisit this topic in a few years to see which approach reigns supreme.
by James (noreply@blogger.com) at November 16, 2009 06:29 AM
Shared Queue:
What experience have you had with shared queues, and what factors did you use to decide whether to use single vs. multiple queues?
+I do not have any experience with these. I have not had to do concurrent programming yet.
An enqueue or dequeue seems like a very small operation. Why is there so much concern over contention to the queue? What factors cause this?
+Although the act of enqueuing or dequeuing may seem quick or simple, there is obviously a drastically negative impact of having contention in these situations that could easily cause the queue to behave unexpectedly or enter an inconsistent state if locking is not done properly. You could also easily end up with a single queue being a severe bottleneck if the tasks are small, there are many workers, or both.
Seeing as how the authors lay out quite a bit of code to support the variations of these patterns, they seem to mitigate their own concern as to the value of "simple protocols.” Why would someone opt for the less efficient implementation, given that the concerns are already outlined, and sample code available?
+I cannot think of a good reason to use the simple protocols other than the fact that it may be easier for n00bs to debug.
Speculation:
Speculation seems like a straightforward pattern, though potentially a quite powerful one. What factors would you consider to decide if speculation is an appropriate/beneficial pattern to apply?
+The pattern boils down to optimistic parallelization. This reminds me of my computer arch class, 433, and our examination of branch prediction.
+You break sequential dependencies assuming that the majority of the time that breakage will not affect the correctness of the problem. In the cases where this does cause a problem, which should be a small percentage if you are applying this pattern correctly, the cost of going back and fixing those problems should be outweighed by the benefits gained from the optimistic parallelization.
+Regarding what factors need to be considered, I would like the ability either to know the statistical distributions of the inputs or to prototype a system in a reasonable amount of time to check the validity of apply this pattern to the given data or domain.
+I think the HTML lexer was extremely helpful edifying this pattern for me. While the abstract discussion all makes sense and seems well written, I just could not figure out what the tasks and dependencies could look like. I wish there were a few more examples that are concrete.
Seeing as how you may not have data about the statistical distribution of inputs to expect before you build the program, and therefore perhaps not know the average cost of rollback, etc., is it prudent to implement the speculation pattern at the outset, or wait to evolve the application to include speculative optimizations down the road? What if the performance increase that speculation has the potential to generate is necessary to meet the minimum requirements of the specification? Is speculation still a viable pattern to apply, or would effort be better spent on other parallel optimizations with more certain outcome?
+I do not think that it is prudent to implement this pattern if you have no concept of whether it will be successful or not. The same applies to almost any architectural or design decisions that you would make.
Circuits:
Was this pattern what you expected from the name?
+Nope. I am not sure how the name fits the pattern that well. Sure, we are operating on the bits as circuits in a CPU would, but I feel that I had to think about how the name & pattern related rather than it being something more obvious. Then again, it does attempt to create an analogy with something that we all know.
It seems like a great deal of this pattern is described through examples of clever transformations for specific problems. Do these examples give you enough information to go about applying this pattern to a problem dissimilar to the given examples? Can you think of a way of describing these transformations at a more abstract yet actionable level than the examples do?
+I could theoretically think of creative ways to map other problems onto this pattern, but this pattern seems like such an extreme way to implement a pattern, that I imagine it would be a last resort that would only be considered if all other parallelization attempts were inadequate or unsuccessful. Though I agree that using 32 or 64 bits to represent what only needs 1, I feel that you would really have to be trying to wring every idle cycle out of your systems for this pattern to be something that you would turn to.
by Mike (noreply@blogger.com) at November 13, 2009 03:48 PM
Branch-and-Bound computational patterns generally involve finding an optimal solution to a problem by iteratively searching through a large space of all possible solutions. Typically, to perform such a search, no known polynomial time algorithm is known. The Branch-and-Bound pattern comprises of four steps: Branching, Evaluation, Bounding, Pruning.
1. Examples: The authors discuss 3-SAT and Traveling Salesman Problem as examples. Yet, a more fundamental example to grasp is the knapsack problem. That said, what might be some other simpler examples of branch-and-bound that you are familiar with?
+One example that I will steal from Wikipedia is a chess program. The incredibly large search space of possible moves 1 to n moves ahead explodes at an exponential rate, and being able to prune the search space efficiently is key for the system to be able to identify the optimal or almost optimal move.
2. Trade-offs: How does the nature of branch-and-bound pattern make it especially hard to deal with incurring low communication overhead while maximizing concurrent computation?
+When branch finds an optimal solution or constraints to apply to boundaries, that must be communicated to the other branches in order to aid their pruning. This could potentially mean that every branch needs to communicate with every other branch, which in turn would lead to a significant amount of chatter. There is a direct relationship between the amount of communication and the pruning efficiency.
3. Solutions and Parallelization:
a. The authors discuss three possible techniques for parallelization: operation-based, structure-based, and construction-based. What makes structure-based parallelization preferable? Are there any scenarios where you would use the operation-based or construction-based technique instead?
+I think this question should be answered by the authors of the paper. I do not feel that they are giving the topic a thorough analysis if they simply skip operation- and construction-based patterns in favor of structure-based ones without much justification.
b. For the structure-based technique, the authors describe four different methods for implementation: Synchronous Single Pool, Asynchronous Single Pool, Synchronous Multiple Pool, and Asynchronous Multiple Pool. Which method do you think would be hardest to debug for correctness? Which method do you think would be hardest to optimize for performance and scalability?
+Correctness: AMP. This complicates the scenario both by distributing data and by having updates happen with no definite period.
+Optimize: ASP. This may be hard to optimize for efficiency because the shared task queue needs synchronized access. It is also complicated by the fact that each processor can request data at any time. If all processors were updated at once, synchronization on the queue would likely be easier because just one process could handle the global lock & coordination.
c. The authors discuss two categories of task assignment: static assignment and dynamic assignment. In real-world applications, how important do you think the choice of the task assignment method is to performance?
+I think it depends on the problem domain. Static assignment can likely be more efficient if tasks size can be known with a reasonable degree of accuracy a priori. That way, the overhead of load balancing does not need to be incurred. It thus follows that dynamic assignment is beneficial in the opposite case. Given that, I think that choosing static assignment in the wrong situations can severely hurt performance by causing many processors to sit idle, but I do not imagine that choosing dynamic assignment in the wrong cases would lead to much of a loss in efficiency.
4. Pattern Comparison: How might fork/join or task queue pattern relate or fit with the branch-and-bound pattern?
+Task queue is an easy one since it is explicitly called out in the paper as being useful as the central data store in the SSP & ASP styles of structure-based parallelism. As for fork-join, each branch can be a fork until you get down to a low threshold, at which point it is best to compute the result sequentially.
Graphical Models are often times used in probability theory to represent uncertainty of a collection of (possibly independent) events. Given the probability of occurrence of some events in the network, one can use the model to infer probabilities of other events within the network.
1. Examples: Did the authors' examples (e.g. hidden Markov Models, Bayesian Netoworks) help illuminate what a graphical model is? Can you think of other simple example problems or scenarios that you could model as a network/chain of probabilistic events?
+The examples made sense only because of my prior exposure to HMMs and Bayesian networks in CS 512. One example of how they could be used is determining how likely you are to see certain sequences of genes or proteins. If the authors wish to talk about HMMs and Bayesian networks, they should really give examples. Although many readers will be familiar with basic graph theory, these concepts go beyond that.
2. Solutions and Parallelization:
a. The authors make a distinction between exact and approximate estimation of probability. In what example problems do you think approximate estimation would work best? In what examples exact estimation?
+It depends on the definition of "work best." Exact algorithms would always be ideal, but a faster approximate algorithm can typically be found & be beneficial from a computational resources perspective. The example above of finding or predicting certain patterns in a genetic sequence is a good candidate for using an approximate algorithm over an exact one due to the immense number of nucleotides that need to be processed.
b. The authors mention the Junction Tree Algorithm, indicating that searching for the correct factorizations can be done in parallel. What other characteristics of graphical models do you think could benefit from parallelism?
+Not sure. The paper is a bit too short for me to really wrap my head around parallelization opportunities, and the mention of the JTA does not help since I am unfamiliar with the pattern.
3. Pattern Comparison: What might this pattern have in common with the Branch-and-Bound pattern? Do you think it has similarity to the Graph Algorithms patterns?
Not sure.
by Mike (noreply@blogger.com) at November 13, 2009 03:47 PM
This pattern is used for problems where you have a very large space and you need to search to make a decision or find an optimal solution. Of course the space is too large for enumerating every point.
In order to find the solution to the problem, this pattern has four operations: Branching, Evaluation, Bounding and Pruning.
This kind of problems are good for parallelism, even thou synchronization is complicated. You will want to reduce communication and increase computation.

This pattern gave me a bad impression, because I thought I was going to read about graphics rather than statistics and probabilities. Also as some classmates mentioned, it lacked of a good and clear example.
This pattern was better written I think, at least it had several examples (thou I missed source code!). The main idea is to decompose (using geometric decomposition) the problem into smaller arrays and perform operations over the arrays in parallel.
The double buffering and adaptive mesh refinement idea was very interesting to me and seems to be very useful for image processing.
by gacevedo at November 12, 2009 09:02 PM

I was happy that they put in a flowchart with this pattern. It went a long way towards making the process easier to understand. I was able to go back to the text description before it, read it again and follow the chart, which helped.
Seeing all of the tradeoffs for the various combinations of sync/async and single/multiple pool was interesting. I don’t think the example helped me understand it any more though. Now that I’ve read a lot of parallel patterns, and have had some exposure to the issues with parallelism, I was able to guess at the tradeoffs by their name, and was (mostly) right, which was nice.
This one is confusing. Not the pattern itself, but I’m confused as to why I’m not getting the bigger picture with this pattern. I’ve done half or so of the things in known uses, but this pattern doesn’t seem familiar. I’ve done a lot of estimation and modeling in my bioengineering work, so everything in the forces section was straightforward. It just didn’t click, and I’m not sure who’s fault that is.
This one was both straightforward and it clicked, unlike the last one. The partitioning reminded me of what was done in speculation, but without the speculation, of course. It was also clear how you’d parallelize this.
The ghost nodes were a new solution for me, though I’ve never tried to parallelize this before myself. The adaptive refinement is exactly like what you do when numerically solving things, so that made a lot of sense.
Their related patterns section both showed what it was similar to and not similar to, which is helpful.
Branch-and-Bound
-------------------------------------------------------------------------------
Branch and bound can be a useful pattern. I’ve actually seen this before but without the connotations of it being a parallel pattern. The place I’ve seen this is in Artificial Intelligence and game logic. For the game of chess the number of moves and response moves you can make is huge. When working with a zero sum game one common strategy is to create a branch of the moves and find out the costs for them. It isn’t possible for complex games to explore the whole branch so estimates are formed and pruning takes place. The pruning is what makes a pattern like this powerful. You can’t enumerate over a whole branch, so instead find a way to remove large portions of the branch at once.
Graphical Models
-------------------------------------------------------------------------------
I didn’t really understand this pattern. I guess it is about trying to capture some kind of information and then create a graph from it. From there, use properties of graphs to make decisions. I think that is what it is about. But the pattern is only three pages, and gives no example. So! …
Structured Grids
-------------------------------------------------------------------------------
So here we are, the last assigned pattern. It has been a long journey and I have over 40 pages of my notes written up into a word document. I’ve had some introduction to patterns like this before. Matrix calculations is very important when it comes to imaging, especially medical imaging. Still I haven’t tried using anything like this in practice. I can say though that I’ve felt that through reading these patterns I’ve learned a good deal about parallelism. What I’ve learned is that unless I’m doing really complex algorithms or time consuming I/O, I’m probably not going to use it much. I still feel that parallelism is going to be the future of computing, but I think most of us are only going to use it by proxy through some sort of library.
by Eric G. (noreply@blogger.com) at November 12, 2009 04:28 PM
Shared Queue
-------------------------------------------------------------------------------
This pattern is about how to use a queue in a parallel sense. I haven’t had to work with something like this before in practice. The only time I’ve had to look at a queue this way was when learning about parallelism and locks in other classes. To be honest I didn’t really find this pattern interesting. A lot of my code lately has been in Java, and because of that I’ve learned to lean on the extensive library that exists for Java. If I am going to need to handle a queue like this, I’d prefer to just use a predefined data type than trying to use a pattern like this and create my own.
Speculation
-------------------------------------------------------------------------------
I really don’t get why people are just linking to a pdf on the wiki. To be honest I’d much rather have the wiki be a wiki where you can make edits to it as needed. Anyway speculation. This is another pattern that I have already seen once before in the world of hardware. In hardware it is usually called something else like jump prediction. Where you predict if the code is going to make a jump in the instructions so you can have it loaded if/when it happens. Here we are looking at this problem with a software lens. The interesting thing about this pattern is how small do you want the tasks to be. If the tasks are too small it’ll make more predictions and potentially waste a lot of time. If the tasks are too big then you don’t get as much parallelism. This is a good example in which programmers should really study how the program is used and run to find a best solution, but I’m only speculating.
Circuits
-------------------------------------------------------------------------------
This is another pattern that I have little experience with the problem being solved. I haven’t really had to touch bit level operations, and what I have done was either in assembly or could be handled with simple shifts. Still having a pattern for this type of calculation makes sense. I’ve worked with people that do encryption with super computers and this is a bulk of what they do. Being able to speed up these calculations can really affect a program. This pattern seems to look at what my super computer friends usually do. That is find a problem, then see what weird manipulations you can do to solve it faster. Still I’m not sure if I’d call this a pattern.by Eric G. (noreply@blogger.com) at November 12, 2009 04:27 PM
Structured Grid Computational Pattern
This is a very common pattern that is very easy to get a mental image of. Structured grids model spaces in N dimensions as discrete points that interact. Their interaction can be regular or irregular (this pattern only talk about regular interactions) and in the regular case they are usually characterized by a fixed-size stencil.
We discussed it in class today and the biggest problem with this pattern is that it overlaps heavily with geometric decomposition. Indeed it actually describes geometric decomposition as the essence of the task manager described in this pattern's solution. First, geometric decomposition is about decomposing and solving geometric problems in parallel with the issues that bring and is not just a task manager. Second, if geometric decomposition was the essence of a subset of this solution then why would one have that pattern at all. A much more sensible strategy is for this pattern to discuss the problem/solution of performing structured grid computations, discuss stencils and then provide pointers to patterns that can be used to parallelize it. This could of course be geometric decomposition, but also master worker, etc. In particular this means that border exchanges belongs in geometric decomposition (which only mention them for about one paragraph. Indeed, as a part of cs527 I have actually written an entire pattern about just border exchanges. The current draft, with all its flaws, can be found here: URL.
Also, border exchanges as described in this pattern are insufficient to implement their first example. Even if the ghost nodes extend three rows/colums out from their chunks as they mention they still have to deal with the (literal) corner cases as their stencils will need cells from diagonal chunks. To see how this is usually handled refer to the border exchange pattern above.
Actually, todays discussion made the tight relationship between geometric decomposition and border exchanges clearer to me and I am wondering whether it would make sense to attempt to extend my border exchange pattern to be a geometric decomposition pattern or perhaps to try to merge them. It would give geometric decomposition a very specific meaning, more so perhaps than the OPL authors intended. It would essentially cover data-parallel, iterative refinement algorithms with border exchanges, but for me that is what geometric decomposition is. In patterns I value preciseness more than completeness. The original design patterns had this property. They did not give a complete design framework that is sufficiently vague to be complete like many textbooks attempt. Instead they gave crisp, specific and often non-intuiative solutions to concrete problems and for me it was this that gave them their greatest value. Besides I suspect other data-parallel algorithms with other types of communication, like the tiled N-Body example given in yesterdays OPL session, are covered by the Data Parallel pattern.
Branch & Bound Pattern
This pattern was pretty good. It started out with a decent example in the SAT decision problem and gave a good explanation of issues with branch and bound by authors who seemed to know what they were talking about very well. One area of potential improvement would be to add a code example. The paper also seems to implicitly conclude that structure-based parallelization is the one that makes sense, but if that is the case then why talk about the other two. Potentially the construction-based parallelization can give less synchronization overhead at the cost of more work, but I can't see a case where I would want operation-based parallelization. If there are such cases then the pattern should address those.
Another possible improvement that was suggested in class would be to start off with a chess or checkers example. The examples that are there are good, but I suspect many more developers will have some mental image of how a computer chess player searches for moves. It would at least be a more colorful and graphical example as the first example.
Graphical Models
When first reading the title of this pattern I must admit I was thinking of 3D meshes. We discussed it in class and other seemed to have other first-impression, but none seemed to match what it actually is about. Professor Johnson suggested that maybe it should be called s Probabilistic Methods or Stochastic Methods, both of which would have been preferable.
Even after reading the pattern I have no idea about how where it would apply or how I would start applying it. It lacks an example and I happen not to know the Junction Tree Algorithm. An example would have gone a long way, but even the pattern lacked pointers on how to go about parallelizing it. Other computation patterns has a parallel issues section and some, like circuits, are sprinkled with pointers to other patterns. This one had one line that said that "This search can be rather trivially parallelized" related to trying out randomized factorizations, which leads me to suspect that other parts of the patter can be parallelized. If not then wouldn't this just be a subset of branch and bound or perhaps something simpler?
by Fredrik Kjolstad (noreply@blogger.com) at November 12, 2009 03:50 PM
Shared Queue:
This paper dealt with how to design a shared queue, a data structure that can be used to better facilitate parallelism by allowing multiple processes to pull from a task queue concurrently. The paper went through some basic design steps by highlighting the necessity to define access to the queue and then deciding on concurrency control. Some code examples were included to better illustrate the design. The author also went into details such as blocking and non-blocking queues and specifics for their implementation.
by Randy Blamo (noreply@blogger.com) at November 12, 2009 11:28 AM
Branch and bound pattern involves dividing the problem into smaller sub-problems, using the problem's definition to set constraints that must be met by sub-problems and eliminating the sub problems that do not satisfy the constraints. This is all done with the four steps: branching, evaluation, bounding and pruning. The theoretical aspects of the pattern are very well explained, but more code examples would have been helpful for understanding.
I did not understand exactly how the Graphical Models pattern solves the problem statement. In many problems we use measurements that can be performed to infer things and make conclusions. This is usually done with some kind of probability distribution as used in this pattern. But the solution is not clear to me. It is unfortunate that they did not include any examples to illustrate the solution.
I think the Structural Grids pattern is basically about using geometric decomposition to decompose the problem into smaller chunks and have processing elements operate on separate chunks in parallel. It looks like this is applicable to image processing, but I did not fully understand the example presented.
by Patg (noreply@blogger.com) at November 12, 2009 08:57 AM
Shared Queue
I thought this pattern overall was pretty good and it covered most of the issues of queues. It also had quite extensive examples (taken from the ppp book). A couple of things I would have liked is a discussion on the use of notify vs. notifyAll and the concept of stampedes if too many threads are awakened at the same time as well as a mention of the pthread primitive they mention that provide wait/notify functionality (condition variables). On the other hand I liked how they gave reference to how to implement notification with semaphores (I have seen this been done once -- sometimes you just don't have access to libraries like full pthread for various reasons). I also think they could have dropped the thirds item in their solution ("Consider using distributed queues"). The way it was described on page 8 it is better covered by the task queue pattern. This pattern, as I understand it, is about implementing a queue. Not on laying one or more queues to share tasks, which is what the other pattern is about.
Speculation
I thought this pattern was really interesting and I am glad I have read it, but it also perfectly illustrates to me Professor Johnson's point about starting out with an example. I did not know this pattern beforehand and was left guessing about what it could be used for and how it worked for six pages before I finally got to the example section. In the mean time I was fed a lot of seemingly interesting facts, but not having an example to attach them to left me very frustrated. The html example should have been presented after the first paragraph of the context, but its full solution could have been addressed later in the solution section.
Of course, like Professor Johnson points out, this is due to the OPL format so I think the solution has to be to change it or forever live with the tension of trying to force an example into the start of a format that invites you to put them late. My suggestion is to simply change the name of the example section to additional examples. Not only would this invite writers to have one example in the context/solution part of the pattern, but it would also invite more than one example. The additional examples section could even contain more tricky examples, which are nice to get after one has been guided through the solution using a simple and obvious one.
A couple of minor suggestions are to add regexes in the HTML lexer example in addition to the textual descriptions (page 6). For me at least this would have been much easier to read. Also, I would have liked the another paragraph explaining how to use speculation with speech recognition. On a positive note I really liked how they contrasted speculation with branch and bound in the related pattern section, describing how they differ. This gave me a better understanding of both patterns.
Circuits
At first I thought the pattern was a concurrent execution pattern so I was very unhappy that it did not talk about parallelism in the problem and context. However, I am more content with it after I figured out it was a computational pattern.
The pattern deals with different techniques/algorithms for bit-level manipulation. What I can not quite figure out is why it is called circuits and not just bit manipulation. Sure, working with bits is working close to the Hardware, but the name circuits implies that the computational pattern is about modeling circuits or something similar. I had a look at the notes section for the this pattern on the OPL site and it talked about gates and synthesis, which is what I originally expected and which compounded my confusion. Has there been a mixup here and is perhaps this pattern incorrectly named?
Other than that the bulk of the pattern is in the example section that presents several algorithms for doing different bit manipulation tasks such as hashing, bit-counting and parity checking. Many of these can be parallelized and the pattern does a good job of giving pointers to other patterns that can be used to do so and I think this is the right way to do it for computational patterns.
by Fredrik Kjolstad (noreply@blogger.com) at November 12, 2009 08:49 AM
Branch-and-Bound Pattern
The pattern has 4 basic operations branching, evaluation, bounding, and pruning operations. The problem is split into sub problems like branches and this decomposition can be done using some other patterns like geometric decomposition pattern or so, and then these pieces can be parallelized and run on multiple cores. Each sub problem get evaluated and if it has a feasible solution then proceed it for the bounding and then to prune so that we will have optimal solution.
I like the flowchart and it’s very nicely described. I like the algorithmic style of presenting the paper as it is appropriate for this one.
Graphical Models Pattern
Looks like this pattern is still in process and didn’t give me a feel for it be a parallel pattern. It only mentioned how transformations and searching of trees can be parallelized. I am not sure if I understand the whole problem that author is trying to solve. Looking forward for class and then reread it again…
Structured Grids Pattern
This pattern explains how to parallelize operations on a large grid or array. The pattern uses geometric decomposition to decompose the problem into smaller arrays and then operations are performed in parallel in each array. Ghost nodes are used to figure out any dependencies of neighboring nodes.
I like the double buffering and adaptive mesh refinement idea s to refine the rendering. I have read some adaptive mesh refinements before in gaming to reduce the number of vertices and edges in a 3D model. Usually this is performed during game development and cached somewhere. Overall I like this paper batter than others I have read for this week’s class.
by Sunitha (noreply@blogger.com) at November 12, 2009 08:46 AM
I would like to say a pattern paper with more than 8-10 pages may be boring and it may not keep users undivided attention in reading it. So Prof. Johnson is clever asking us to write a pattern with 6 or so pages.
All these 3 pattern papers are too long to me and for last few pages I struggled to keep my attention. Concepts for patterns are clear but they may have done a better job keeping it simple and clear with good examples. Here is my brief review on these patterns and looking forward for good discussion on these though…
Shared Queue Pattern
Shared queues are well known in application development community. A centralized share queues are easier to implement. A well known use of shared queue is with a thread pool. This can be implemented as individual threads take tasks directly from the queue or a master thread distributes the tasks to the threads.
This paper used the examples with separate locks for push and pull but I think it is easier to use and maintain a single lock for the entire queue. Two locks might be more depending on the number threads at a time in the queue and maximum size of the queue.
Speculation Pattern
This speculation is similar to branch prediction that I learned in CS421. Given example is not very clear to me and I may get better understanding of this pattern with respect to parallel world and speculation.
Circuits Pattern
As I am learning about the time bounds that depend on the input size from the algorithms class (As algorithm teaches to be linear or polynomial in execution of the algorithm with respect to input size) this pattern’s problem is eye catching for me.
I like the example but I wish they showed it from high level language. I have to listen to class to get more insight in this one….
by Sunitha (noreply@blogger.com) at November 12, 2009 08:13 AM
Shared Queue:
What experience have you had with shared queues, and what factors did you use to decide whether to use single vs. multiple queues?
Shared queues are widely used in the system I develop at work. One of the share queues is an “I/O completion status queue” shared by multiple processors that process completed I/O operation. We use a single queue, because back in the days there were only 2 processors. Now with the increase number of processors sharing the queue we are getting scalability issues because of excess contention.
An enqueue or dequeue seems like a very small operation. Why is there so much concern over contention to the queue? What factors cause this?
They are basic operations on a sequential program, when the queue is accessed concurrently appropriate synchronization must be implemented
Seeing as how the authors lay out quite a bit of code to support the variations of these patterns, they seem to mitigate their own concern as to the value of "simple protocols". Why would someone opt for the less efficient implementation, given that the concerns are already outlined, and sample code available?
It depends on the requirements of the application. If performance is not a major concern, I would go with a simple implementation even if the more efficient, but complicated code already exists. In my work environment we have performance numbers that we need to meet and we have to meet them even if it means writing an ugly code as long as it performs good.
Speculation:
Speculation seems like a pretty straightforward pattern, though potentially a quite powerful one. What factors would you consider to decide if speculation is an appropriate/beneficial pattern to apply?
I agree this seems like a very powerful pattern. It uses the idea of hardware speculation implemented in almost all modern processors. The factor to consider here is the cost of wrong speculations and the probability of making such wrong speculations. If the cost of wrong speculation is minimal compared to the gain of good speculation, then this pattern can produce great results.
Seeing as how you may not have data about the statistical distribution of inputs to expect before you build the program, and therefore perhaps not know the average cost of rollback, etc., is it prudent to implement the speculation pattern at the outset, or wait to evolve the application to include speculative optimizations down the road? What if the performance increase that speculation has the potential to generate is necessary to meet the minimum requirements of the specification? Is speculation still a viable pattern to apply, or would effort be better spent on other parallel optimizations with more certain outcome?
Speculation is a very powerful technique and I think that, for applications with predictable inputs, it is worth investigating if speculation can help meet the requirements.
Circuits:
Was this pattern what you expected from the name?
No, but after realizing that it was about bitwise operations, the name make more sense because digital circuits deal with bitwise operations.
I am a low level firmware programmer and I deal with bits operations on a regular basis. The concepts explained in the patterns are widely used by low level programmers: Large inputs are divided into smaller blocks and the bitwise operations are applied to the blocks independently to perform the calculation.
by Patg (noreply@blogger.com) at November 12, 2009 07:46 AM
Branch and bound is a very useful pattern in many areas, and I think it has been discussed a lot in the previous patterns. The pattern involves branching, evaluation, bounding, and pruning operations. The problem is split into subproblems, maybe using the geometric decomposition pattern, and these can be parallelized and run on multiple CPUs. The pattern is well described and not too hard to understand. The parallelism that we can exploit is explained and the advantages/disadvantages are described. I would have liked to see some code examples, but that would probably take up many pages.
Structured Grids
Structured grids describes how to parallelize operations on a large array or grid. The pattern uses geometric decomposition to decompose the problem into smaller arrays and then operations are performed in parallel in each array. Ghost nodes were a concept that was introduced in geometric decomposition (I think) and it makes the pattern work when there are dependencies on the neighboring nodes. Double-buffering is a technique commonly used in graphics rendering, and it was interesting to see it being applied here. I have seen some adaptive mesh refinements before in gaming to reduce the number of vertices and edges in a 3D model. Usually this is performed during game development and cached somewhere. It is usually not a resource intensive task, but maybe for large models, it may be nice to see how the pattern can exploit the parallelism.
Graphical Models
Graphical models didn't seem like parallel pattern. It is only briefly mentioned how transformations and searching of trees can be parallelized. The pattern is about graphically representing the dependencies the variables in a probability equation. I really didn't see how that had anything to do with the parallelism. It definitely helps you understand the problem. I think this pattern could use the branch and bound pattern to search for solutions.
by Garrett (noreply@blogger.com) at November 12, 2009 07:25 AM

I know I’ve said that I like seeing code examples, and also that that seems to be the consensus from the blogs and from in class. However the example at the end was just too much to take in, at least for the level of attention I want to spend on this pattern right now. I’d probably appreciate it more if I was going to be implementing the shared queue myself and I’d need to understand it deeply, but right now I’m just reading the pattern to understand it to know when I’d use it.
The other examples, however, were great. Looking at the same shared queue problem and showing all the different locks and such to make it safely concurrent was very nicely done.
The dynamic generation of predicates interests me much more static. It frees the programmer from thinking about it beforehand. I’ve never done either, so I’m not sure what the cost of dynamic generation is vs. static, but it is more attractive.
Ah hah. I wrote that paragraph before getting to the examples section. Now that I saw the example, coming up with the predicate can be really obvious and straight-forward, as it was for the html lexer. Now I wish there was also an example that was complex enough to warrant dynamic predicates.
This pattern great. I like the ones that show me something completely new and which are obviously useful to me. The ability to add concurrency to inherently sequential problems I did find the examples great though, but that might be largely due to the fact I’ve done some html matching before (though not quite lexing), and am familiar with hidden markov models (they’re used in bioinformatics).
This was somewhat off-putting to me from the very start, since I’ve never done bit-level math/code. I think this caused me to not really give it a chance. I mean, I read all the words and such, and tried to understand the code examples, honest.
But, this just isn’t for me. I thought I’d like the MD5 example. I’ve used md5 a lot, and while now I turn to sha for my hashing needs, it seemed like an example I knew would do the trick. But as it turns out although I know about md5, I never tried to implement it myself, so I couldn’t get anything out of this example.
Engine Yard ran a contest over the summer to brute force sha1 hashes with the shortest hamming distance to a supplied hash. The winners got 130 of the 160 bits right and used gpus to do the work. It seems like this pattern would be good for this sort of problem. My favorite solution, though, was croudsorcing the work by doing it in javascript and asking people to visit the site with their browsers.
Shared Queue Pattern
The shared queue pattern is found quite often in software applications involving the processing of multiple tasks. A centralized share queue is definitely one of the easier implementations. A common use of shared queue is with a thread pool. One concept is to allow the individual threads pull tasks directly from the queue. Another option is for a master thread to take responsibility for distributing the tasks to the threads for load balancing.
Although the examples in the paper used separate locks for put and take, I believe it is more common to use a single lock for the entire queue. Two locks might be overkill depending on the number threads working off the queue and the max size of the queue itself.
Speculation
I do wish the HTML lexer example was earlier in the paper. After getting through most of the paper, I recalled learning about speculation with branch prediction in a previous class. With branch prediction one path is speculatively chosen based on the past behavior of the program. This can be extended to speculatively execute both branches and terminate one of them when the result of the branch predicate is available.
I do want to go back and review this pattern again after the class discussion. I suspect there will be informative information discussed to help me gain a better understanding of this pattern.
Circuits Pattern
I can honestly say I had no idea what the Circuits Pattern entailed when reading the title. Even though this is not a Physics class, I thought it might have something to do with electronic circuits.
It is unfortunate I have not been professionally exposed to bit shifting as I might have had a more genuine appreciation for the examples of limited information presented in this paper. The md5 example was familiar and somewhat intriguing to see how it is really implemented. Looks like this might be another pattern I will revisit after the class discussion.
by James (noreply@blogger.com) at November 12, 2009 04:58 AM
This time the blog is about three of the computational patterns in OPL: Branch-and-Bound, Graphical Models, and Structured Grids.
The Branch-and-Bound is intuitive from high-level. Since the search space of a problem is simply to grand to do an exhaustive search, we can only try a subset of it to get a "feel" of what the correct answer would be. Based on the subset, we derive the ranges that the target falls in and thus we can prune the space outside these ranges. Consequently, the solution of this pattern naturally falls into four steps: Branching, Evaluating, Bounding, and Pruning.
This pattern is grouped in the computational patterns category, however some portion of it extensively addresses parallel issues which makes it seems like a parallel algorithm strategy patterns. It might be more clear if the authors follow the examples of some other patterns in OPL and separate them into two patterns.
The Graphical Models is the one that I have trouble getting a firm grasp on. A high-level description of the pattern, such as "given the probability of occurrence of some events, one can use the model to infer probabilities of other events" seems to make sense but I found myself looking for an example to have a firm understanding. Unfortunately the Example section was omitted.
The structured grids pattern deals with data that can be represented nicely as a multi-dimensional arrays. Here the pattern description is not about how to represent a problem in structured grids rather it is about having efficient ways to decompose the grids into chucks. In this sense, it is strongly tied to the Geometric Decomposition pattern.
by Anhoe (noreply@blogger.com) at November 12, 2009 12:45 AM
Shared Queue
This implementation pattern was presented well enough. The presentation was clear and well-developed, with clean and well-documented code examples. I get it; the queue is a collection framework for tasks generated by some mechanism (decomposition, etc.), and how it is able to provide additional benefits based on the particular implementation details (scheduling, load-balancing, etc.). What I’m unclear about is exactly how this pattern differs from the Task Queue pattern.
The Task Queue pattern attempts to differentiate itself by classifying the Shared Queue, ”…using the queue data structure to manage data instead of tasks, please refer to Shared Queue pattern.” The Shared Queue pattern attempts distance simply through a mention within its own context section, “… a task queue is commonly used as a scheduling and load‐balancing framework, which implements the Master/Worker pattern.” The Task Queue at least had the respect to capitalize Shared Queue in its mention.
Scheduling and load-balancing, simplicity of implementation, Master/Worker structure; these are the issues that these pattern speak to. The Shared Queue states forces of simple protocols vs. increased concurrency, and the question of single, multiple, or distributed queues. The Task Queue states forces of centralized vs. decentralized implementation, and eager vs. lazy scheduling. Let’s see where the differences are…
Simple Protocols vs. Increased Concurrency
Task Queue: “Centralized task queues are simpler to implement, however… Distributed task queues… are able to scale to much larger number of processing elements (implying performance).” “It is more difficult to enforce global ordering… using a distributed queue… extra synchronization is needed.”
Shared Queue: I will grant that when this pattern discusses protocols, it is speaking directly to synchronization of the implementation, at the level of the code. When Task Queue speaks to synchronization, it is at the level of the implementation of the data structure. Again, this is a somewhat fine line of demarcation; “should we salt the soup, or salt the stock that we will ultimately put in the soup.”
Scheduling and Load-Balancing
Shared Queue, “When a thread generates a new task, it is put in its own task queue. When a thread executes a task, it first tries to take one from its own queue. If it is empty, it randomly chooses another thread and tries to steal one from that threads’ queue.”
With Task Queue, “each processor has its own queue, it enqueues its local tasks to this queue… When a processor finishes executing, it looks for a new task in its queue. If there is not a new task in its own queue the processor steals a task from another processor.”
Master/Worker structure
Shared Queue: One of the examples in the distributed queue section is “a queue to hold tasks in a Master/Worker algorithm.” It then goes on to state in the Related Pattern section, “Shared Queue pattern is often used to represent the task queues in algorithms that use the Master/Worker pattern.” That is very handy, especially since it seems to accomplish that quite well all by its self.
And furthermore, if the Shared Queue is used to implement the Task Queue, and the Shared Queue is so adept at handling tasks and “data” (Abstract Data Type; Duh… yea, a queue), wouldn’t a Task Queue be just as capable of handling the same “data”?
The question may be, when a queue is used to queue tasks (or “data”), is the queue structure following the Task Queue pattern or the Shared Queue pattern. They seem to be working together to separate themselves. If you listen to the two patterns it would sound like a queue structure implemented through the Task Queue pattern, which was implemented through the Shared Queue pattern. That’s like me saying that my car is an implementation of the automobile pattern, which was implemented through the private-passenger vehicle pattern. Their just too close to distinguish.
What experience have you had with shared queues, and what factors did you use to decide whether to use single vs. multiple queues?
There have been many times I have implemented a buffer. A good portion of those times the implementation was a double-buffered implementation.
An enqueue or dequeue seems like a very small operation. Why is there so much concern over contention to the queue? What factors cause this?
While the operations seem somewhat atomic to the mind’s eye; there are actually a good number of pointers being moved around behind the scenes. These can easily get corrupted in the event of a collision.
Seeing as how the authors lay out quite a bit of code to support the variations of these patterns, they seem to mitigate their own concern as to the value of "simple protocols". Why would someone opt for the less efficient implementation, given that the concerns are already outlined, and sample code available?
Necessity is the mother of conviction, as well as invention. The situation dictates with an application, especially when it comes to performance requirements. I have been forced to write some unbelievably ugly (almost unreadable) update statements in SQL; because while the nested cursors would take upwards of 2 hours to complete, the update statement would finish in just under 15 minutes. When there are about 20-25 of these required to “roll-up and archive” multiple levels of summary data in a 24hr period, it doesn’t take a minor in math to get the picture. Complexity should be our little friend.
Speculation
I do not see this paper as being a viable pattern; in that the premise for its implementation is not straight-forward and applicable in a general sense, it is seriously flawed in its depiction, and as presented speaks as much to its inability as its ability to serve in its intended capacity.
Regardless, in this instance the problem statement is terse; and does not give a complete scope of the issue. In addition the context seem a bit short-sighted, “The main insight to parallelize these problems” is to look at ways of breaking “long chains of sequential dependencies.”
The pattern speaks many times of dependencies; “for a subset of the tasks unpredictable dependencies emerge”, “In many problems, there are long chains of sequential dependencies”, “it might so happen that the dependent chains are not really dependent”, “break the dependency chain by ignoring dependencies)”, “we can break long dependency chains by ignoring them”, “ignoring these dependencies may not always hold good for all inputs.” In each instances spoken of by the pattern, the dependency is pointing to a “predicate.”
This predicate is a measure of correctness; and nowhere in the paper does it mention consistency. There are executions performed speculatively, with complete disregard for most if not all dependencies of the task executed, and in not one place is there mention of the possibility of a concurrently running thread having a data collision with another. Nowhere is there any consideration for transactional consistency, serial equivalence, or isolation integrity. These are the basic, the absolute foundational considerations for committing speculatively executed results.
The paper states, “If we use the task queue implementation pattern, information needs to be given to the task queue, so that locality is exploited. Executing tasks that are localized in the dependency graph leads to less wasted work and better efficiency. Executing tasks that are close in the dependency graph provides the benefits of early validity detection, thereby saving work if the predicates turn out to be invalid.” This is technically correct in general; but completely absurd with respect to speculative execution. It is specifically the tight proximity of tasks on the dependency graph, executing concurrently, that lead to data collisions; and ultimately the invalidation of speculation. This causes tasks to be rescheduled and rerun; and hence this above anything else. would lead to the complete eclipse of the concurrent advantage.
Is it assumed that some synchronization is required? Are we, the readers left to imagine the possible consequences of such a radical approach to parallelism? While I will acquiesce that the paper does a somewhat adequate job of introducing the concept of speculative execution
Please pardon the vociferous nature of this diatribe (yes, I have been told since I was 12 years old that I use too many words, and too many of those words are too large). In full disclosure, I have taken on the opportunity to produce a Speculative Execution Pattern as my contribution this semester. In any event, this is just my opinion; I could be wrong.
Speculation seems like a pretty straightforward pattern, though potentially a quite powerful one. What factors would you consider to decide if speculation is an appropriate/beneficial pattern to apply?
The primary factors to consider in approaching this pattern are:
1) Tasks that have a consistently high read to write ratio.
2) There exists an exceptionally large task pool for concurrency.
3) There exists a large number of processing resources (CPUs).
4) The processing space implements effective message-passing.
The first is somewhat self-explanatory. Two and three are a result of the speculative nature of no locking. Four is related again to the absence of locking; but more in that shared memory is not used as a means of communication.
Seeing as how you may not have data about the statistical distribution of inputs to expect before you build the program, and therefore perhaps not know the average cost of rollback, etc., is it prudent to implement the speculation pattern at the outset, or wait to evolve the application to include speculative optimizations down the road? What if the performance increase that speculation has the potential to generate is necessary to meet the minimum requirements of the specification? Is speculation still a viable pattern to apply, or would effort be better spent on other parallel optimizations with more certain outcome?
The Speculative pattern in general has a rather specific niche, and a narrow comfort zone within that niche; but it is the pattern’s overwhelming ability within that scope that makes it appealing. If there is a venue as described above, it would be well worth the relative minor effort to experiment with speculation.
Circuits
This paper was in many ways, exceptional. I was actually captivated by this pattern; first by its lack of correlation to my expectations, and then by the Bit-Counting example. If you ask yourself, would you rather receive a single $1Million dollar transaction, or a tenth-penny on every global transaction during a year? It is clear to see that great things can come from the smallest and most overlooked spaces; maybe my mind has a chance after all.
The examples were amazingly clear and well presented. The build-up (limited to page one) was a bit short; I think the scope illustrated by the examples outweighs the terse introduction. It may be questionable if this is actually a pattern, or a newly defined framework-space for a family of future patterns. In any regard, my take-away was that things we do every day, without giving them a thought as to why we are doing them, are possibly an overlooked pattern-rich activities collection that has great value. I can see that there are potentially thousands of off-the-top-of-the-head, sub-conscious decisions made daily, that are truly rich in decision-making activity; but allude our conscious detection, and are therefore not in our available application arsenal. If we could apply the full scope of mental activity required driving a mile while talking on the phone, to concurrency; all computers would have 32-core processors, and I wouldn’t have to stand in line for an hour to get my license renewed.
Was this pattern what you expected from the name?
I think, not at all. (I meant to do that…)
It seems like a great deal of this pattern is described through examples of clever transformations for specific problems. Do these examples give you enough information to go about applying this pattern to a problem dissimilar to the given examples? Can you think of a way of describing these transformations at a more abstract yet actionable level than the examples do?
This is a very interesting question, even more so in that it is the first question posed that I was previously considering. My answer to the direct question about the examples would be to say no; the examples express how broad a pattern-space this could be. I postulate a much larger family of micro-level concurrency patterns; and that they could possibly account for a surprising volume of concurrency.
The examples seem to radiate outward in many directions; and not focus towards a pattern path. Even with the divergent character of the pattern itself; it is very descriptive in its effort and captivating in its essence, and I think it would be very possible to apply the basic character principles conveyed here.
~ JLSjr
by JLSjr (noreply@blogger.com) at November 12, 2009 12:08 AM
We use Branch & Bound Pattern when we have an extremely large space which we need to search to make a decision or to find an optimal solution. The space is so large that enumerating every point in the space is infeasible. The branch-and-bound pattern has four basic operations. The solution involves four steps: Branching, Evaluation, Bounding and Pruning. Branching is when a subproblem p cannot be solved directly and consequently it is decompose into smaller subproblems. Evaluation is the application of the objective function to a point in the search space. Here the reader is assumed to have familiarity with optimization problems. For instance, he should know that an objective function is a function associated with an optimization problem which determines how good a solution is, for instance, the total cost of edges in a solution to a traveling salesman problem. In the Bounding step, we should find a way to derive upper and lower bounds on the solutions contained in the space by solving a simplified version of the search problem. In the Pruning step, if a subproblem has been solved, or it can be proved that the subproblem contains no points better than the current optimal solution, or the subproblem is infeasible, we should eliminate the subproblem. The issues involved with parallelization seem to be well explained. The three main approaches to parallelize the branch-and-bound algorithm are: Operation-based parallelization (e.g. apply the same operation on several subproblems in parallel), Structure-based parallelization (solve different subproblems in parallel) and Construction-based parallelization (construct several branch-and-bound trees in parallel).
Although less than three pages long, I found GraphicalModels hard to understand. The problem is not clearly enunciated, some sections are very vague and the example section is missing which makes hard to understand. I preferred to google for other versions and try to understand more from there.
In my opinion Structured Grid is very well written. With limited knowledge in grid computation I was able to understand most of it in the read. Structured Grid offers a scalable solution for efficient updating of a structured grid, where updates are local in the physical space. Besides a few well explained examples, the pattern clearly explains all the domain specific terms (e.g. ghost nodes, spencils). In a structured grid problem, the location of nodes in the physical space directly maps to their location in data array (this is in contrast to the Unstructured Grid where the location of the element in the array cannot be directly inferred from the physical location). The updates are logically concurrent, but in practice are implemented as a sequential sweep through the computational domain. The most important aspect of solving a problem with a structured grid pattern is the domain decomposition. To decompose the problem into tasks, the grid is sliced into contiguous chunks, and each processing element is assigned a chunk for work. The chunks should be of small enough granularity in order to exploit parallelism and of reasonable size in order to fully exploit the computational resources of each processing element. The forces governing the solution are the typical task granularity vs. parallelism, simple task partitioning (e.g. ignoring unequal workloads) vs. fault tolerance.
by zamolx3s (noreply@blogger.com) at November 11, 2009 10:33 PM
Debugging tools have come a long way--who would have thought that with a push of a button you can step through code in different languages running across multiple machines in a seamless environment. These tools are invaluable, but what do you do before you have a bug within a configured environment? Enter the nasty no-repro bug. These are the gremlins that all software developers dread--an issue either reported or confirmed, the detail of which we don't know. How do we handle this? If the error is reproducable in a certain environment, we might have the luxury of instrumenting the code temporarily, and incurring the high overhead associated, or we can just start guessing (see Speculation pattern ;-).
Uniprocessor debugging is hard because issues can be sourced across space (code) and time. On a multiprocessor system, we introduce additional dimensions in which errors can occur, so we end up with something that feels intuitively like geometric growth of debugging complexity.
A tool like PRES is a welcome addition to the developer's troubleshooting arsenal. I agree that bugs don't *need* to be reproducible the very first time in the lab, but especially if the replays are entirely automated, a small number of replays can easily be tolerated (and really, what other choice do you have if you can't withstand the overhead of exhaustive logging?).
Sources of nondeterminism that make this whole process difficult can be somewhat hard to reproduce, due to some of them being generated by low-level constructs like interrupts. Virtual machine technology can help alleviate some of this by virtualizing things that were once relegated to pure hardware-controlled method, with limited possibility to control--now a tool like PRES could decide when things like interrupts might be generated.
by Dan Orchard (noreply@blogger.com) at November 11, 2009 09:10 PM
There are slides of Barbara Liskov's talk about work leading to her Turing Award. They give the complete references to the papers that she said had a big impact on her work.
A closely related talk was the Onward! essay by William Cook, On Understanding Data Abstraction. His basic point is that abstract data types and objects are different ways to achieve data abstraction. An ADT has a set of functions that are able to access a data type. The details of the data type are hidden from the programmer, but the functions are well known and the compiler knows the representation of the data type. An object is a function that returns a set of functions that can access the data. The details of the functions are hidden from the programmer, and the compiler has no idea about the representation of the data. OO programs are much more extensible because you can replace one object with another at run-time, i.e. each time through a loop a call to an object's method can result in a different function being invoked. However, you can't do this with an ADT, because the functions of an ADT require a particular representation.
William's point is that this is more important that identity or inheritance. You can add identity and inheritance to ADTs and you do not get nearly as big a change as you get when you add "message sending"/"virtual functions"/"polymorphism by late-binding of function calls".
by Ralph Johnson (johnson@cs.uiuc.edu) at November 11, 2009 08:47 PM
I was ill therefore not very productive at the start of this week so this blog comes a bit late, but better late than never.
Task Queue
I was pretty happy with this pattern and felt it had enough examples for my taste. It did a decent job of describing both shared queues as well as a high level overview of work stealing although it was perhaps a bit light on the latter. The only thing I missed from the example section was in the Java example. When implementing task queues in shared memory programs one usually use a condition variable to put threads to sleep instead of spinning waiting for the queue to return non-null values, but I guess this is mentioned in the shared queue pattern.
Graph Partitioning
I was pretty unhappy with this pattern as it was hard to understand what they were talking about. Like Professor Johnson I get the impression that the pattern is really about how to map tasks onto processing elements at startup time for load balancing and reduced communication. However, this was far from clear from the start of the paper which quickly delved into the details. I suppose it could also be used to remap tasks at runtime if the graph changes. I read a pattern that sounded similar to me from paraplop09 a while back called "A Pattern Language for Topology Aware Mapping.". This pattern language talked about how to map a graph of tasks and the communication between these efficiently onto a machine topology graph so as to effectively use the machine's resources: "Given an object communication graph and a processor topology graph, how can we map the first graph onto the second such that most messages travel a small number of hops (links) on the network" In that paper the intent was much clearer and if these two patterns are talking about the same thing I prefer that version.
Loop Parallelism
Loop parallelism is a pretty common pattern for parallelization on shared memory machines besides fork-join. I thought this paper did a fairly good job of describing it and went through a lot of different techniques one can use to prepare loops for parallelization. However, unless one are already familiar with these techniques then one would probably not get them after reading this pattern as the descriptions were very brief (they did contain examples though!)
There were a few minor issues that I thought were strange. For example, why doesn't the i and j loops carry dependencies in the example on the top of page 3? It looks to me like they do (data created in iteration i of the i loop requires data from iteration i-1 and i+1) unless they employ double buffering in which case that should be mentioned.
Another problem I had was that they refer to the map-reduce pattern for descriptions of reduce. Surely a pattern on loop parallelism is the right place to include a description of both reduce and scans? Reduces are certainly used by map-reduce, but that is a very high level pattern and I think a more detailed description fits here.
by Fredrik Kjolstad (noreply@blogger.com) at November 11, 2009 08:24 PM
It is nice to see some discussion of graphical processing, and it makes sense then that what we would see is a whole-picture processing by effect on the image. Also, what's with that particular image being some sort of academic standard. It seems... sleazy.
Anyway. I find it strange that the Structural Grid pattern (perfect for that sort of application as is indicated in the paper) starts out presenting itself as though it is going to be a discussion of a multidimensional version of the multicore array we discussed earlier and then begins to try to phrase itself as some sort of master conductor unit when it is clearly exactly a tool to help apply basic operations over multiple dimensions quickly. Multiple threads are of course going to be needed for that to work, and the data structure should be the one to dispatch but I kept reading the phrasing as though it was overstating what was actually being accomplished. The discussion on the ghost nodes would have benefited from a little more depth and a few more examples. Other than that a good description of what was necessary.
I liked the amount of coverage of the different approach styles in the Branch & Bound paper and the description of the pros and cons of each. I also appreciated a concrete example of when the multiqueue structure would be used to gain efficiency, and the fact that the queue technique in the previous paper was used directly. It seems to me that using the branch and bound technique to drive statistical models (like the problems Monte-Carlo is being used for in most examples) would be a viable route as we can limit the total number of runs using this structure and then collapse the results to find the actual best and the conditions under which it occurs.
by Will Vesely (noreply@blogger.com) at November 11, 2009 08:07 PM
Shared Queues
This pattern covers the common practice of using a shared queue in concurrent program. The authors talk about the major decisions being a centralized vs decentralized queue and simple protocols vs. more concurrency.
I have implemented a shared queue, where a pool of threads pulled tasks out of a shared queue. Since the accessing the queue represented a fractional amount of the task, it used what the authors call a simple protocol, using a mutex to protect the data structure. It was also centralized, since the application was running on a single system. If the program had spent more time in the dequeue or queueing operations, I may have chose to implement in a way that allowed greater concurrency.
Speculation
I found this pattern rather neat, as I hadn't thought of applying the idea which I was aware of to my software before. Rather than worrying a serialized task must run serially, one can still divide it up, detect if the result is correct, then rerun the parts that are incorrect serially.
It seems to me that problems that could fall under this are not too common as it needs several preconditions. First, the tasks must be long enough to to warrant the overhead and dividing up of speculations. Second, the task must be at least potentially dividable. The example given of parsing HTML was great as the page can be divided up. However, if the each task truly requires a previous task to complete, it will need rerun anyway, unless the result of the previous task can be guessed.
Circuits
This pattern was not what I expected when I read the name. Instead of being about digital circuits, it was really on how to parallelize bitwise operations over large blocks. A help in some contexts, but not one I'd tend to see in my work.
by awana81 (noreply@blogger.com) at November 11, 2009 07:39 PM
Branch and Bound
by 慈 (noreply@blogger.com) at November 11, 2009 06:55 PM
Shared Queue:
by 慈 (noreply@blogger.com) at November 11, 2009 06:21 PM
Shared Queue:
I suppose this is a sign that I haven't really been thinking things through, because we've had several patterns discussed already that talked about using a queue to create multiple tasks or threads with the option to use task stealing. And the idea of concurrency problems happening at the dequeue and enqueue operations scares me to death. Multiple counts of the same tasks being created, multiples getting added to the queue, and the list goes on. That we have waited until now to discuss this issue is worrisome, but what's scarier is that those papers we read before seemed to be willing to sweep these possibilities under the rug. And the deep problems at the task generation and retrieval stage can cause such extreme problems with data and process integrity.
Speculation:
It's a strange but interesting idea. I'm much more used to stages of programs flowing into each other in a way of tight dependency. I can see the possible uses for this showing up as a part of setup of methods and difficulty during other stages of execution. Additionally, decomposing the various tasks of interest into the FSMs for easy charting is going to be complex. The parsing example is a very good one, one where multiple stages can be broken cleanly into non-dependent parts, but getting another example showing where it could be useful would help easy my mind. Also, and this is probably just my lack of understanding, but I'm still not clear on the recovery process when a decomposed thing doesn't execute correctly. It's nice that this has been considered and covered I'm still confused as to how it's actually going to do its job.
Circuits:
Strange that this paper is almost 90% examples with very little definition. While I understand that there are only so many bitwise operations, seeing the examples of each with a clear explanation would be nice because I've only got a vague understanding of this stuff because it has been a while since I've last looked at it. Other than that, it seems sensible, though I personally haven't really worked with bitmasks there are very few tools to do so easily, and so doing CRCs on large disc images cannot be an easy task.
by Will Vesely (noreply@blogger.com) at November 11, 2009 06:19 PM
The Shared Queue pattern discusses how to share a queue amongs threads without race conditions. Threads that access the queue may access them in a blocking or non-blocking manner. Blocking is good if you expect the queue to fill up with more data eventually, so the worker thread can just sit and wait. Non blocking is good if you want to return immediately whether there are tasks or not in the queue, which can lead to better performance. The examples contain a lot of Java code and expects you to understand how the synchronized, notifyAll, and wait functions work. It builds a blocking and non-blocking Queue class iteratively, so it was easy to follow along and understand the necessary steps to build such a Queue in Java. Fork/join is mentioned as an application of this pattern, and how worker threads may steal tasks off the tail of other thread queues. The pattern was easy to read, but I think the author put too much emphasis on the Java examples.
Speculation is an interesting parallel pattern. It involves a thread speculating on the result of dependent tasks running in parallel. If the end results are correct, then we are happy since gained performance, but if the results were wrong, we have to recompute it, which is bad for performance. The pattern involves understanding the dependencies in the tasks, breaking them down, seeing which dependencies can be removed or reduced amongst tasks. Once the tasks are broken down to a granular enough level, we can run them in a parallel framework, such as master/worker or fork/join. The difficult part is that after a task is complete, we have to check whether the results were correct. If they are not correct, then we can either run the tasks sequentially or try running again with different input parameters. I think this thread is useful for problems with little dependencies, or algorithms where the results of dependencies can be easily guessed.
Circuits pattern exploits parallelism in bitwise algorithms or applications by operating at the bit level. It didn't have anything to do with circuits, at least none that I could read from the paper. The idea behind this paper is breaking the input into equal width blocks, depending on the architecture of the hardware for performance reasons. I think there was too much focus on the examples and not enough on the pattern itself. They talked a lot about clever transformations on the bitwise operations, but those things shouldn't be the focus of the pattern. I would have liked the authors to explain step 2 a little better in their solution, but otherwise the pattern was easy to follow and understand.
by Garrett (noreply@blogger.com) at November 11, 2009 05:01 PM
This blog is about the Shared Queue, Speculation, and Digital Circuits patterns in OPL.
The Shared Queue pattern is one those patterns having a dedicated entity that is responsible for distribution. In some patterns it distributes worker threads or units of execution to tasks; in contrast this pattern distributes tasks to units of execution.
Having a central queue is the most straight forward and conceptually simple implementation of this pattern. However, such a centralized structure soon becomes the performance bottleneck. Timothy Mattson and Youngmin Yi propose using distributed shared queues as the solution for the bottleneck. However, this has its own disadvantages such as more complex logic and difficult to enforce ordering. As mentioned in the Task Queue pattern, when going with distributed queues solution one can set up local queues for local threads containing local tasks to take advantage of spacial locality.
One comment on the structure of this pattern: the solution section seems to be an amalgam of solution, forces, implementation, and example sections and it is quite long. It would be more crisp if the authors separate them out.
Speculation is one of the pattern that I have not used. On the other hand, it is a very common technique in modern processors and so the concept is definitely not foreign. I tried to think why I have never used it and I believe it has to do with the difficulties in implementing the consolidation/ recovery phase. I also noticed in this pattern the degree of speculation is limited to one (a speculative task does not trigger execution of more speculative tasks) else it gets very hard to track.
As for the Digital Circuits pattern, it seems to be about packing bits into words and perform bit-wise operation in parallel due to the fact that most of the instruction set architecture has a data granularity of a word. If so, this is how I always do it, especially since C-like languagess do support bitwise operators. In this sense, this pattern is pretty much universal.
by Anhoe (noreply@blogger.com) at November 11, 2009 01:24 AM
Loop Parallelism
This one brought back even more vivid remembrances of projects gone by. I had forgotten just how simple, and yet powerful these techniques can be. While the data locality issues are mentioned here, I think they are downplayed a bit; especially when considering the proper positioning of data can have almost as large an impact as the loop manipulation.
This was illuminated during this same timeframe. This is silly, and many will say “how is that possible, I would have seen that immediately. Etc. Etc. Etc.” Well, my project involved bitmapped files; which by default have least-significant bit precedence (who thought of this?). Well I was focused so much on the loop modification issues of this project, that I did not consider the dramatic expense I was paying for byte-swapping; just so the numbers would look right as I output them. This was silly, right (I know, silly is not the word many are thinking of right now!). But, it was an enlightening experience for me, in that it is so easy to be blinded to the rather simple aspects of something when you are neck-deep in the complexities.
1. The author says the goal of incremental parallelization is to "evolve" a sequential program into a parallel program. If you have experience in refactoring code from sequential to parallel, how did your approach differ from the steps presented in the paper?
I mentioned a few papers back regarding a project in a CompOrg course. These were exactly the type of manipulations required to accomplish that project. This project was not a refactoring project, per se; however, as was mentioned in the paper, I did refactor in the sense that the loop strategies evolved. I used several of the techniques mentioned in this paper; peeling, unrolling, and interchange were three of the major ones.
2. Many of us have used Eclipse or another IDE, with their respective plugins, for execution profiling. What has been your experience with these tools? What is your preferred method of locating the code in question (automatic or manual)?
Considering the type of code location referred to here, I would have to say I have used a more manual method in the past. I have used the IDE based profiling more for test-case scenario identification, and help identify the tracing paths through the code for better debugging purposes.
3. The author states, "The application of many transformations to many loops may be necessary before efficient performance is achieved." What is your opinion on this statement?
With respect to “many transformations” and “many loops”, the second is straight-forward. If 90%+ of the loops in a program will be subject to scrutiny, “many” would usually apply. The first “many” is also a given, in that for each of the “many loops” differing combinations of techniques very well may produce different dynamics of performance. Memory, as mentioned, is a definite consideration in performance at this level. The change in the structure of processing a loop in one location of code could have a powerful affect on the memory characteristics available to another section of code while it is processed. As the strategy applied to one loop in a portion of the code is modified, the basic memory profile dynamics change for a section previously refactored.
Task Queue Implementation Pattern
The Task Queue Pattern initially looks good. It’s clear, clean, and easily recognizable. However, as I start to think about this pattern, it very well could have some possible short-comings. This is an implementation pattern, and as such relies on another pattern to supply it with tasks; if I’m not mistaken, the paper mentions load-balancing as an inherent bonus. While this is great, this pattern potentially adds a great deal of overhead to the works. The decomposition pattern would probably have locks, this pattern has locks, and the execution of the task may have parallelism overhead as well.
The _get(bool) function had mention of its locking mechanism; but, the _put() operation makes no mention of any locking. And furthermore, with the distributed version, the getFromTail() function also does not mention locking. I am not knowledgeable in Python, at all; but it would seem the _put() in a shared space could have contention. Even the getFromTail() operation, while in a dedicated space, is open to “stealing” from other spaces; and therefore could experience contention as well.
There is also no mention of how the blocked executions would ever get released. I’m sure there are no issues here that could not be resolved. Regardless, I like the queue structure; and it has been used forever for this kind of scenario (CPUs, routers, etc.).
1. How would you compare the Task Queue pattern to other similar distribution patterns like divide-and-conquer and fork/join pattern?
I’m not sure about “divide-and-conquer” as it could be implemented in a number of ways. Fork/Join on the other hand should make for a great comparison. Fork/Join has its own overhead issues; but when considering there is no demand for any locking, and the built-in call-back capability put it in a somewhat attractive position.
2. It was nice to see a pattern with some actual code examples. How important were the examples in understanding the concepts presented in this paper?
It may have been the clarity of “code” that led to my looking for particulars. This is almost a poster-child for the use of abstract pseudo-code for this type of demonstration. I snippets of working production code are used, that’s one thing; but, to present code in this way may be a hindrance.
Graph Partitioning Implementation Strategy Pattern
Yes, very math oriented; but that’s not all bad. I was better impressed with the treatment of the graph algorithms in this paper than the previous one, in that these were presented more in scope. I had never been acquainted with the KL algorithm, and it was interesting to see it in action; I can see its usefulness, especially in the blending of simultaneous analysis and decomposition of graphs.
In that direction, it was mentioned that hypergraphs and others could display quantitative values upon the elements of the graph to express certain dimensions. One thought I had as I was reading was, when using the KL process on hypergraphs, identify the most likely path to success (or expected path). In this case, as the graph grew large, the success quotients would allow for a pruning of the graph based on a given minimum success value.
1. This was probably one of the more math-oriented patterns I have seen and/or read. How often have you experienced a situation where the graph partition strategy would have been useful?
I have written programs to analyze surjective and bijective graph structures, mapped from referential integrity models in our database. Not rocket science, but interesting.
2. If never, can you think of any future situations where this pattern might be come in handy and could be applied?
It would seem to lend itself well to the analysis of social networks that are growing in interest lately. Of course, network traffic and latency studies are fundamental in this space. The graph patterns are applicable to everything from algorithms to zymurgy; well maybe you can’t brew beer with them, but they are very flexible.
~ JLSjr
by JLSjr (noreply@blogger.com) at November 11, 2009 12:59 AM
Shared queue: a queue is very common data structure that can be shared among multiple units of execution in a parallel program. Its implementation should tradeoff between simple synchronization protocols and increased concurrency. The easiest approach employs a single synchronization construct, however such a protocol most likely will encompass too much of the shared queue in that single synchronization construct, thus increasing the chances that UEs will remain blocked waiting to access the queue and will limit available concurrency. Fine tuning can increase the concurrency by providing concurrent access to multiple parts of the queue however this approach tends to be more complicated and thus error prone. On machines where the memory hierarchy of the underlying hardware and the number of UEs are limited, a distributed queue can avoid contention and decrease the parallel overhead. One idea of implementation is the distributed queue used in fork/join. Here each thread from the thread pool is associated a non-blocking distributed queue. When a thread generates a new task, it is put in its own task queue. When a thread executes a task, it first tries to take one from its own queue. If it is empty, it randomly chooses another thread and tries to steal one from that threads’ queue. As each task tries to take a new task first from its own task and there is no single hot spot, it eliminates the performance degradation.
I found speculation quite an interesting pattern. I was especially familiar with hardware speculation and particularly with branch predication, which is present in almost all major desktop CPUs. With branch predication, for each branch instruction, one path is speculatively chosen based on the past behavior of the program. This can be extended to speculatively execute both branches and terminate one of them when the result of the branch predicate is available. Speculation as a parallel algorithmic strategy is similar but it is usually at a much higher granularity. The main assumption in the speculation algorithmic strategy pattern is that we can break long dependency chains by ignoring them. Creating independent tasks is then a speculative action as ignoring these dependencies may not always hold good for all inputs. In such cases, we might not be much better off than a serial algorithm, but on average we will be able to take advantage of the available hardware parallelism. Predicates are constructs that are used to keep track of what needs to be true for the decomposition to hold. Predicate validity check is a very important aspect of this pattern. A quick way of checking the predicate validity is essential. A last step of the algorithm is having a commit/recovery mechanism depending on whether the predicate was true/false respectively.
The link assigned to the third pattern reads Digital circuits, while the description attached is actually for the Circuits pattern. Even though related the two are still different patterns. Digital circuits offers solutions for implementing an hardware architecture instruction set that meets the performance and power requirements. The circuits pattern applies to problems where the output is a logical function or bitwise calculation of the input. Since most of the instruction sets are restricted to operate on a word size of data, the pattern provides a way to efficiently compute the output. Digital circuits deals with bitwise operations as well as the arithmetic operations (ADD, SUBTRACT, ALU) on bits or bit-vectors. Therefore, it is a natural fit for the circuits pattern from the view of algorithmic strategies.
by zamolx3s (noreply@blogger.com) at November 10, 2009 11:58 PM
The three patterns in this set are Branch and Bound, Graphical Models and Structured Grids. First, we can talk about Branch and Bound. Amongst the four operations of branching, evaluation, bounding and pruning, I feel that the most important and valuable one would be the application of the objective function to a point in the search space. Evaluation becomes harder for larger search spaces. Pruning a particular branch using this strategy becomes pretty simple as it is quite easy to determine when the minimum value of a linear integer programming problem on a branch is greater than or equal to the current minimum possible solution. I feel that this kind of optimization is also very similar to the Monte Carlo pattern that we studied earlier in the class. The flowchart given in the description demonstrates this pattern very well. The second pattern is regarding Graphical models. Graphical Models is the next pattern in this set. This doesn't have much of a description though. In addition to the examples mentioned, I would like to add Bayesian inference as one of the possible algorithms using this pattern. Code snippets would have been helpful in understanding this pattern for people not familiar to the mechanisms described in the paper. The last pattern is about Structured Grids. We can observe that this is one among many patterns which are applicable to problems in image processing and in computer vision. It is mentioned in the context that the problems which follow this pattern perform updates until a particular error threshold is reached. It seems to me as this is closely linked to the granularity of the tasks to be performed. Overall, these set of patterns were more closely linked to problems that I could relate to in the real world , and were a very interesting read.
by Chandra (noreply@blogger.com) at November 10, 2009 09:01 PM
- What do you expect from a debugging tool?
+breakpoints
+stepping through the code line by line examining which branches are taken
+examining the state of variables
+examining the call stack
- What debugging tools do you usually use? What are good and bad about them?
+Visual Studio
+Covers all of the above. Has the ability to move backwards in a function; if you set a breakpoint & need to inspect what happened two lines earlier, you can drag it back and reply those lines. Conditional breakpoints. Can execute commands against the system's current state. Can modify the system's state & observer what impact that has.
+Doesn't have good support for "edit and continue," where you can change the inner workings of a method without needing to recompile & relaunch the application
- What are the challenges of debugging a sequential program? What are additional challenges of debugging a parallel program?
Debugging parallel programs can be extremely difficult due to heisenbugs that are timing related. These bugs happen only with certain interleavenings of the code, which are typically difficult to reproduce. The paper we read earlier in the semester on Chess from Microsoft Research is a good treatment of this problem and a possible solution.
- How important is replay for debugging?
Replay is essential for effective debugging. If the error cannot be consistently reproduced, it is often very difficult to diagnose and cure.
- If you need to design a deterministic replay system, how are you going to do it for sequential programs? Does it work for parallel programs? If now, how to make it work?
- What are the additional challenges to make your tool work on multi-core or multi-processor systems?
These questions seems like they would need a ridiculously long answer. The sequential problem is not too difficult, and you can refer to the Chess paper from earlier in the semester for the parallel case.
- Given a replay system, what else should we do to make it really help debugging?
Solve the halting problem.
- How would virtual machine technologies help?
Virtual machines could help by being able to completely capture every aspect of the system's state, providing you with a snapshot of the system's state that includes active processes, RAM utilization, etc. The environment can often have an impact on a system that behaves as expected internally, and those factors are often difficult to capture & especially difficult to repeat.
- What are the state of arts?
Not sure what the question is...
What the paper mentions in section 1.2 on "State of the Art" is based around systems to record executions. This is different from Chess, which did not attempt to record bugs but instead tried to find a way to efficiently and exhaustively examine different thread interleavenings. This seems to fall into the authors' bucket of existing software practices that repeatedly replay an execution to search for bugs caused by interleavenings. Granted, Chess should be drastically more efficient than the common naïve practices, but it is still in that category. PRES attempts to record enough information that a small amount of replays are needed to reproduce the bug, but it limits the amount of observations that are recorded in order to minimize the overhead of running a production system with PRES active.
PRES has three key components:
-recording only a subset of events. Although there may not be enough information to replay every intermediate event exactly as it occurred, enough information is recorded to reproduce the bug. Once PRES has reproduced the bug once, it can then consistently reproduce it 100% of the time.
-a replay system that reproduces unrecorded actions & events
-using data from unsuccessful replays to help guide subsequent attempts
by Mike (noreply@blogger.com) at November 10, 2009 07:40 PM
I think the funniest moment in this entire paper was the accurate yet obvious statement that if it takes a large amount of time to locate a bug in a program it will take a longer time to fix the bug. I understand the direction that this statement is heading towards: that concurrency bugs can be maddening to locate, and if it takes years to find them, it may take years just to correctly reproduce the conditions they occur under, and therefore it will take even more time to figure out how to protect against that condition. It is a very important point, as these concurrency issues are extremely important to solve before concurrency becomes standard.
What I find interesting is the approach of a lite-data retention in the first passes to build an outline of where the problems lie. I remember a previous paper where we were using technical abilities to predict areas of conflict in concurrency. This seems like a great possibility for combination of tools in order to better produce techniques for correct development. Again it seems that we have a great idea, with good technical setup and a good tool base, but since it succeeds at its own aims the authors haven't looked beyond to create a stronger technique for us to use to solve these complex problems with greater efficacy.
by Will Vesely (noreply@blogger.com) at November 10, 2009 07:35 PM
The first thing that I would like to talk about is shared queue. The first aspect is the context in which this could occur. I have come across many graph mining problems which run serially on single units of execution. Parallelizing this process using the shared queue could be an interesting research area to look into. I completely agree with the notion that more complicated constructs could increase the error-rate of the code structure. In the solution, the most important point would be about defining the shared ADT since the rest of the process would depend on it. Again, as Prof.Johnson mentioned in today's class, this paper has a lot of code, and seems to be better written than some of the other patterns that we read earlier. It will be interesting to read more about the choice of nested locks here. The second paper is about speculation. This seemed to me to be the most interesting of the three patterns. The commit/recovery mechanism depending upon the status of a predicate seems to be the most complicated of the tasks mentioned in the paper. I would love to see more elaboration on the recovery mechanism especially on the description of more intelligent recovery mechanism mentioned in the paper. I searched for more information and examples regarding this pattern but I couldnt find that many.
The last paper is about digital circuits pattern. This paper seemed the one requiring the most introspection among the three. Among the examples given, I could relate most to the databases pattern because I have been working on this but not on the scale mentioned in the paper. For most bit level operations that I have encountered in C, I have had to intelligently use up the remaining bits in a word so as to not waste memory. Could we say that bit level parallelism is an 'embarrassingly parallel' problem? Among the examples presented in the paper I found the description of finding the nth gray code to be very interesting. The fitting of data completely as a word is the key aspect here. Overall, I think the three patterns in this set are some of the most introspective that I have read this entire semester.
by Chandra (noreply@blogger.com) at November 10, 2009 06:57 PM
Loop Parallelism
--------------
1. The author says the goal of incremental parallelization is to "evolve" a sequential program into a parallel program. If you have experience in refactoring code from sequential to parallel, how did your approach differ from the steps presented in the paper?
I do not have experience doing this, but I do not imagine that you would have great results. I believe we talked about how evolutionary parallelism can achieve minor performance gains, but the architecture really needs to support the parallelism using the appropriate patterns for the problem domain for the speedups to be significant. Amdahl's Law says that we are limited by the amount of code that must be done sequentially, and I imagine refactoring loops will not parallelize a large percentage of a typical codebase. Though I am not a fan of premature optimization, there is a threshold above which you can be certain that an area of the code will be significantly computationally expensive relative to the rest of the code & look for parallelization opportunities there.
2. Many of us have used Eclipse or another IDE, with their respective plugins, for execution profiling. What has been your experience with these tools? What is your preferred method of locating the code in question (automatic or manual)?
I have had very good success with profiling tools in the .NET world. They have helped easily identify down to the line where performance bottlenecks are.
3. The author states, "The application of many transformations to many loops may be necessary before efficient performance is achieved." What is your opinion on this statement?
That is true. See my response to 1.
Task Queue Implementation Pattern
------------------------------
1. How would you compare the Task Queue pattern to other similar distribution patterns like divide-and-conquer and fork/join pattern?
I think the task queue is by definition a divide-and-conquer algorithm. Otherwise, there would be no need for a queue because there would be only one, large task. Perhaps James meant recursive splitting & fork-join.
The distinction with recursive splitting is that here the work items in the queue are broken down into their smallest-level pieces, which are then grabbed by threads that need more work. The problem that recursive splitting has that the task queue does not is that some effort should be spent to make sure that the splitting points for dividing the work into recursive tasks should split the work-load evenly. Otherwise, it can be quite easy for the load to become unbalanced, e.g. a worst-case quicksort that takes O(N^2) instead of the average O(N log N).
This pattern interests me because I have read several blog posts recently talking about it in different contexts. The typical example for explaining its optimality in queuing theory is people queuing at a grocery store. If customers divide themselves and form N queues for N cashiers, customers can become blocked for an inordinately long time behind a difficult/slow customer. In addition, adding and removing cashiers causes a significant amount of churn in the customers. By contrast, if all customers form a single queue and the customer at the head goes to the first available cashier, no customer is unfairly stuck behind another difficult/slow one, and cashiers can be added and removed depending on the queue length without affecting the waiting customers. The former is the distributed queue version and the later is the centralized queue version of this pattern.
Perhaps my memory is failing me a bit, but fork-join seems like a specific implementation of the decentralized version task queue pattern. In fact, the distributed task-stealing pattern that was first presented in the fork-join paper is explicitly mentioned here.
2. It was nice to see a pattern with some actual code examples. How important were the examples in understanding the concepts presented in this paper?
I did not find the code samples useful. I thought that the concept was explained quite well in the first few sections of the paper; better than most of the others, in fact.
Graph Partitioning Implementation Strategy Pattern
-------------------------------------------
1. This was probably one of the more math-oriented patterns I have seen and/or read. How often have you experienced a situation where the graph partition strategy would have been useful? If never, can you think of any future situations where this pattern might be come in handy and could be applied?
I felt that the majority of the coverage was dedicated to the nitty-gritty details of several graph-partitioning algorithms. I think more high-level analysis would have been beneficial. That being said, I do think that this is a useful pattern to have in your toolbox because of the large number of problems that could be mapped to a set of nodes and weighted, optionally directed edges. Examples could be physical distance between points, latency between nodes on a network, or monetary costs to ship a package between two cities.
by Mike (noreply@blogger.com) at November 10, 2009 03:48 PM
Powered by Planet!
Last updated: January 24, 2012 11:10 AM