The Daily WTF

Error'd: All Natural Errors

"I'm glad the asdf is vegan. I'm really thinking of going for the asasdfsadf, though. With a name like that, you know it's got to be 2 1/2 times as good for you," writes VJ.

 

Phil G. wrote, "Get games twice as fast with Epic's new multidimensional downloads!"

 

"But...it DOES!" Zed writes.

 

John M. wrote, "I appreciate the helpful suggestion, but I think I'll take a pass."

 

"java.lang.IllegalStateException...must be one of those edgy indie games! I just hope it's not actually illegal," writes Matthijs .

 

"For added flavor, I received this reminder two hours after I'd completed my checkout and purchased that very same item_name," Aaron K. writes.

 



CodeSOD: A Slow Moving Stream

We’ve talked about Java’s streams in the past. It’s hardly a “new” feature at this point, but its blend of “being really useful” and “based on functional programming techniques” and “different than other APIs” means that we still have developers struggling to figure out how to use it.

Jeff H has a co-worker, Clarence, who is very “anti-stream”. “It creates too many copies of our objects, so it’s terrible for memory, and it’s so much slower. Don’t use streams unless you absolutely have to!” So in many a code review, Jeff submits some very simple, easy to read, and fast-performing bit of stream code, and Clarence objects. “It’s slow. It wastes memory.”

Sometimes, another team member goes to bat for Jeff’s code. Sometimes they don’t. But then, in a recent review, Clarence submitted his own bit of stream code.

schedules.stream().forEach(schedule -> visitors.stream().forEach(scheduleVisitor -> {
    scheduleVisitor.visitSchedule(schedule);

    if (schedule.getDays() != null && !schedule.getDays().isEmpty()) {
        schedule.getDays().stream().forEach(day -> visitors.stream().forEach(dayVisitor -> {
            dayVisitor.visitDay(schedule, day);

            if (day.getSlots() != null && !day.getSlots().isEmpty()) {
                day.getSlots().stream().forEach(slot -> visitors.stream().forEach(slotVisitor -> {
                    slotVisitor.visitSlot(schedule, day, slot);
                }));
            }
        }));
    }
}));

That is six nested “for each” operations, and they’re structured so that we iterate across the same list multiple times. For each schedule, we look at each visitor on that schedule, then we look at each day for that schedule, and then we look at every visitor again, then we look at each day’s slots, and then we look at each visitor again.

Well, if nothing else, we understand why Clarence thinks the Java Streams API is slow. This code did not pass code review.
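For contrast, the same traversal needs no streams at all; plain loops do the same work and read top to bottom. This is a sketch using hypothetical stand-ins for the real Schedule/Day/Slot/Visitor types, since we only see fragments of them:

```java
import java.util.List;

// A sketch of the same traversal with plain loops. The record and
// interface definitions are stand-ins, not the real application types.
public class ScheduleWalk {
    public record Slot(String name) {}
    public record Day(String name, List<Slot> slots) {}
    public record Schedule(String name, List<Day> days) {}

    public interface Visitor {
        void visitSchedule(Schedule schedule);
        void visitDay(Schedule schedule, Day day);
        void visitSlot(Schedule schedule, Day day, Slot slot);
    }

    // Identical behavior to the nested forEach pyramid: every visitor
    // sees every schedule, then every day, then every slot.
    public static void walk(List<Schedule> schedules, List<Visitor> visitors) {
        for (Schedule schedule : schedules) {
            for (Visitor v : visitors) v.visitSchedule(schedule);
            if (schedule.days() == null) continue;
            for (Day day : schedule.days()) {
                for (Visitor v : visitors) v.visitDay(schedule, day);
                if (day.slots() == null) continue;
                for (Slot slot : day.slots()) {
                    for (Visitor v : visitors) v.visitSlot(schedule, day, slot);
                }
            }
        }
    }
}
```

A stream version built on flatMap is possible too, but when every step is a side-effecting visit call, loops are the honest tool.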



CodeSOD: A Private Code Review

Jessica has worked with some cunning developers in the past. To help cope with some of that “cunning”, they’ve recently gone out searching for new developers.

Now, there were some problems with their job description and salary offer: specifically, they were asking for developers who would do too much and get paid too little. Which is how Jessica started working with Blair. Jessica was hoping to staff up her team with some mid-level or junior developers with a background in web development. Instead, she got Blair, a 13+ year veteran who had just started doing web development in the past six months.

Now, veteran or not, there is a code review process, so everything Blair does goes through code review. And that catches some… annoying habits, but every once in a while, something might sneak through. For example, he thinks static is a code smell, and thus removes the keyword any time he sees it. He’ll rewrite most of the code to work around the removal, except in one case, where the method was called from a cshtml template file, so no one discovered that it didn’t work until someone reported the error.

Blair also laments that with all the JavaScript and loosely typed languages, kids these days don’t understand the importance of separation of concerns and putting a barrier between interface and implementation. To prove his point, he submitted his MessageBL class. BL, of course, is to remind you that this class is “business logic”, which is easy to forget because it’s in an assembly called theappname.businesslogic.

Within that class, he implemented a bunch of data access methods, and this pair of methods lays out the pattern he followed.

public async Task<LinkContentUpdateTrackingModel> GetLinkAndContentTrackingModelAndUpdate(int id, Msg msg)
{
    return await GetLinkAndContentTrackingAndUpdate(id, msg);
}

/// <summary>
/// LinkTrackingUpdateLinks
/// returns: HasAnalyticsConfig, LinkTracks, ContentTracks
/// </summary>
/// <param name="id"></param>
/// <param name="msg"></param>
private async Task<LinkContentUpdateTrackingModel> GetLinkAndContentTrackingAndUpdate(int id, Msg msg)
{
  //snip
}

Here, we have one public method, and one private method. Their names, as you can see, are very similar. The public method does nothing but invoke the private method. This public method is, in fact, the only place the private method is invoked. The public method, in turn, is called only twice, from one controller.

This method also doesn’t ever need to be called, because the same block of code which constructs this object also fetches the relevant model objects. So instead of going back to the database with this thing, we could just use the already fetched objects.

But the real magic here is that Blair was veteran enough to know that he should put some “thorough” documentation using Visual Studio’s XML comment features. But he put the comments on the private method.

Jessica was not the one who reviewed this code, but adds:

I won’t blame the code reviewer for letting this through. There’s only so many times you can reject a peer review before you start questioning yourself. And sometimes, because Blair has been here so long, he checks code in without peer review as it’s a purely manual process.



A Massive Leak

"Memory leaks are impossible in a garbage collected language!" is one of my favorite lies. It feels true, but it isn't. Sure, it's much harder to make them, and they're usually much easier to track down, but you can still create a memory leak. Most times, it's when you create objects, dump them into a data structure, and never empty that data structure. Usually, it's just a matter of finding out what object references are still being held. Usually.

A few months ago, I discovered a new variation on that theme. I was working on a C# application that was leaking memory faster than bad waterway engineering in the Imperial Valley.

A large, glowing, computer-controlled chandelier

I don't exactly work in the "enterprise" space anymore, though I still interact with corporate IT departments and get to see some serious internal WTFs. This is a chandelier we built for the Allegheny Health Network's Cancer Institute which recently opened in Pittsburgh. It's 15 meters tall, weighs about 450kg, and is broken up into 30 segments, each with hundreds of addressable LEDs in a grid. The software we were writing was built to make them blink pretty.

Each of those 30 segments is home to a single-board computer with their GPIO pins wired up to addressable LEDs. Each computer runs a UDP listener, and we blast them with packets containing RGB data, which they dump to the LEDs using a heavily tweaked version of LEDScape.

This is our standard approach to most of our lighting installations. We drop a Beaglebone onto a custom circuit board and let it drive the LEDs, then we have a render-box someplace which generates frame data and chops it up into UDP packets. Depending on the environment, we can drive anything from 30-120 frames per second this way (and probably faster, but that's rarely useful).

Apologies to the networking folks, but this works very well. Yes, we're blasting many megabytes of raw bitmap data across the network, but we're usually on our own dedicated network segment. We use UDP because, well, we don't care about the data that much. A dropped packet or an out of order packet isn't going to make too large a difference in most cases. We don't care if our destination Beaglebone is up or down, we just blast the packets out onto the network, and they get there reliably enough that the system works.

Now, normally, we do this from Python programs on Linux. For this particular installation, though, we have an interactive kiosk which provides details about cancer treatments and patient success stories, and lets the users interact with the chandelier in real time. We wanted to show them a 3D model of the chandelier on the screen, and show them an animation on the UI that was mirrored in the physical object. After considering our options, we decided this was a good case for Unity and C#. After a quick test of doing multitouch interactions, we also decided that we shouldn't deploy to Linux (Unity didn't really have good Linux multitouch support), so we would deploy a Windows kiosk. This meant we were doing most of our development on MacOS, but our final build would be for Windows.

Months go by. We worked on the software while building the physical pieces, which meant the actual testbed hardware wasn't available for most of the development cycle. Custom electronics were being refined and physical designs were changing as we iterated to the best possible outcome. This is normal for us, but it meant that we didn't start getting real end-to-end testing until very late in the process.

Once we started test-hanging chandelier pieces, we started basic developer testing. You know how it is: you push the run button, you test a feature, you push the stop button. Tweak the code, rinse, repeat. Eventually, though, we had about 2/3rds of the chandelier pieces plugged in, and started deploying to the kiosk computer, running Windows.

We left it running, and the next time someone walked by and decided to give the screen a tap… nothing happened. It was hung. Well, that could be anything. We rebooted and checked again, and everything seems fine, until a few minutes later, when it's hung… again. We checked the task manager- which hey, everything is really slow, and sure enough, RAM is full and the computer is so slow because it's constantly thrashing to disk.

We're only a few weeks before we actually have to ship this thing, and we've discovered a massive memory leak, and it's such a sudden discovery that it feels like the draining of Lake Agassiz. No problem, though, we go back to our dev machines, fire it up in the profiler, and start looking for the memory leak.

Which wasn't there. The memory leak only appeared in the Windows build, and never happened in the Mac or Linux builds. Clearly, there must be some different behavior, and it must be around object lifecycles. When you see a memory leak in a GCed language, you assume you're creating objects that the GC ends up thinking are in use. In the case of Unity, your assumption is that you're handing objects off to the game engine, and not telling it you're done with them. So that's what we checked, but we just couldn't find anything that fit the bill.

Well, we needed to create some relatively large arrays to use as framebuffers. Maybe that's where the problem lay? We kept digging through the traces, added a bunch of profiling code, and spent days trying to dig into this memory leak…

… and then it just went away. Our memory leak just became a Heisenbug, our shipping deadline was even closer, and we officially knew less about what was going wrong than when we started. For bonus points, once this kiosk ships, it's not going to be connected to the Internet, so if we need to patch the software, someone is going to have to go onsite. And we aren't going to have a suitable test environment, because we're not exactly going to build two gigantic chandeliers.

The folks doing assembly had the whole chandelier built up, hanging in three sections (we don't have any 14m tall ceiling spaces), and all connected to the network for a smoke test. There wasn't any smoke, but they needed to do more work. Someone unplugged a third of the chandelier pieces from the network.

And the memory leak came back.

We use UDP because we don't care if our packet sends succeed or not. Frame-by-frame, we just want to dump the data on the network and hope for the best. On MacOS and Linux, our software usually uses a sender thread that just, at the end of the day, wraps around calls to the send system call. It's simple, it's dumb, and it works. We ignore errors.
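That sender is simple enough to sketch. Here's roughly what "wrap send and ignore errors" looks like, rendered in Java rather than Python, with a hypothetical host and port:

```java
import java.net.DatagramPacket;
import java.net.DatagramSocket;
import java.net.InetAddress;

// A sketch of the "simple, dumb" sender described above: wrap a send
// call, ignore errors, hope for the best. Host and port are hypothetical.
public class FrameSender {
    public static boolean blast(byte[] frame, String host, int port) {
        try (DatagramSocket socket = new DatagramSocket()) {
            DatagramPacket packet = new DatagramPacket(
                frame, frame.length, InetAddress.getByName(host), port);
            socket.send(packet); // UDP: no ack, no retry, no waiting
            return true;
        } catch (Exception e) {
            return false; // a dropped frame costs one frame of animation
        }
    }
}
```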

In C#, though, we didn't do things exactly the same way. Instead, we used the .NET UdpClient object and its SendAsync method. We assumed that it would do roughly the same thing.

We were wrong.

await client.SendAsync(packet, packet.Length, hostip, port);

Async operations in C# use Tasks, which are like promises or futures in other environments. They let .NET manage background threads without the developer worrying about the details. The await keyword is syntactic sugar which lets .NET know that it can hand off control to another thread while we wait. While we await here, we don't actually await the results of the await, because again: we don't care about the results of the operation. Just send the packet, hope for the best.

We don't care- but Windows does. After a load of investigation, what we discovered is that Windows would first try to resolve the IP address. Which, if a host was down, obviously it couldn't. But Windows was friendly, Windows was smart, and Windows wasn't going to let us down: it kept the Task open and kept trying to resolve the address. It held the task open for 3 seconds before finally deciding that it couldn't reach the host and errored out.

An error which, as I stated before, we were ignoring, because we didn't care.

Still, if you can count and have a vague sense of the linear passage of time, you can see where this is going. We had 30 hosts. We sent each of them 30 packets every second. When one or more of those hosts were down, Windows would keep each of those packets "alive" for 3 seconds. By the time one expired, 90 more had queued up behind it.

That was the source of our memory leak, and our Heisenbug. If every Beaglebone was up, we didn't have a memory leak. If only one of them was down, the leak was pretty slow. If ten or twenty were out, the leak was a waterfall.

I spent a lot of time reading up on Windows networking after this. Despite digging through the socket APIs, I honestly couldn't figure out how to defeat this behavior. I tried various timeout settings. I tried tracking each task myself and explicitly timing them out if they took longer than a few frames to send. I was never able to tell Windows, "just toss the packet and hope for the best".
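For what it's worth, the "track each task and time it out" idea maps naturally onto Java's CompletableFuture (the .NET analogue hangs a timeout off the Task). It bounds the task's lifetime in your own code, even though, as described, nothing like it could stop Windows from holding resources underneath:

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.TimeUnit;

// Sketch: give each fire-and-forget send a hard deadline, then swallow
// any timeout or error, in keeping with the UDP philosophy. The send
// itself is simulated; the point is the bounded lifetime, not the I/O.
public class BoundedSend {
    public static CompletableFuture<Void> sendWithDeadline(Runnable send, long millis) {
        return CompletableFuture.runAsync(send)
                .orTimeout(millis, TimeUnit.MILLISECONDS)
                .exceptionally(err -> null); // give up quietly
    }
}
```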

Well, my co-worker was building health monitoring on the Beaglebones anyway. While the kiosk wasn't going to be on the Internet via a "real" Internet connection, we did have a cellular modem attached, which we could use to send health info, so getting pings that say "hey, one of the Beaglebones failed" is useful. So my co-worker hooked that into our network sending layer: don't send frames to Beaglebones which are down. Recheck the down Beaglebones every five minutes or so. Continue to hope for the best.
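The shape of that workaround is easy to sketch: a gate that skips hosts marked down until a cooldown passes. The host name and the five-minute figure follow the story; everything else is a hypothetical simplification:

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fix: don't send frames to hosts known to be down, and
// only retry them after a cooldown. The clock is passed in explicitly
// so the behavior is easy to reason about (and to test).
public class HostGate {
    private final Map<String, Long> downUntil = new HashMap<>();
    private final long cooldownMillis;

    public HostGate(long cooldownMillis) {
        this.cooldownMillis = cooldownMillis;
    }

    public boolean shouldSend(String host, long nowMillis) {
        Long until = downUntil.get(host);
        return until == null || nowMillis >= until;
    }

    public void markDown(String host, long nowMillis) {
        downUntil.put(host, nowMillis + cooldownMillis);
    }
}
```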

This solution worked. We shipped. The device looks stunning, and as patients and guests come to use it, I hope they find some useful information, a little joy, and maybe some hope while playing with it. And while there may or may not be some ugly little hacks still lurking in that code, this was the one thing which made me say: WTF.



CodeSOD: A Unique Choice

There are many ways to mess up doing unique identifiers. It's a hard problem, and that's why we've sorta agreed on a few distinct ways to do it. First, we can just autonumber. Easy, but it doesn't always scale that well, especially in distributed systems. Second, we can use something like UUIDs: mix a few bits of real data in with a big pile of random data, and you can create a unique ID. Finally, there are some hashing-related options, where the data itself generates its ID.
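Sketched in Java, the three families look like this (the names are illustrative):

```java
import java.nio.charset.StandardCharsets;
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

// The three families of unique ID described above, as a sketch.
public class IdStrategies {
    // 1. Autonumber: simple, but needs a central counter to coordinate.
    private static final AtomicLong COUNTER = new AtomicLong();
    public static long nextId() { return COUNTER.incrementAndGet(); }

    // 2. Random UUID: effectively unique with no coordination at all.
    public static UUID randomId() { return UUID.randomUUID(); }

    // 3. Hash-based: the data itself determines its ID (here via a
    //    name-based, deterministic UUID).
    public static UUID contentId(String content) {
        return UUID.nameUUIDFromBytes(content.getBytes(StandardCharsets.UTF_8));
    }
}
```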

Tiffanie was digging into some weird crashes in a database application, and discovered that their MODULES table couldn't decide which approach was correct, so it opted for two: MODULE_ID, an autonumbered field, and MODULE_UUID, which, one would assume, held a UUID. There were also the requisite MODULE_NAME and similar fields. A quick scan of the table looked like:

MODULE_ID  MODULE_NAME  MODULE_UUID                           MODULE_DESC
0          Defects      8461aa9b-ba38-4201-a717-cee257b73af0  Defects
1          Test Plan    06fd18eb-8214-4431-aa66-e11ae2a6c9b3  Test Plan

Now, using both UUIDs and autonumbers is a bit suspicious, but there might be a good reason for that (the UUIDs might be used for tracking versions of installed modules, while the ID is the local database reference, which should never change, even though the UUID might). Still, given that MODULE_NAME and MODULE_DESC both contain exactly the same information in every case, I suspect that this table was designed by the Department of Redundancy Department.

Still, that's hardly the worst sin you could commit. What would be really bad would be using the wrong datatype for a column. This is a SQL Server database, and so we can safely expect that the MODULE_ID is numeric, the MODULE_NAME and MODULE_DESC must be text, and clearly the MODULE_UUID field should be the UNIQUEIDENTIFIER type, right?

Well, let's look at one more row from this table:

MODULE_ID  MODULE_NAME  MODULE_UUID                    MODULE_DESC
11         Releases     Releases does not have a UUID  Releases

Oh, well. I think I have a hunch what was causing the problems. Sure enough, the program was expecting the UUID field to contain UUIDs, and was failing when a field contained something that couldn't be converted into a UUID.
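The failure mode is easy to reproduce. Sketched in Java rather than the application's actual stack, with tryParse as a hypothetical defensive wrapper, not part of any standard library:

```java
import java.util.Optional;
import java.util.UUID;

// UUID parsing fails the moment a row holds prose instead of an
// identifier; a defensive wrapper turns the crash into an empty result.
public class UuidColumn {
    public static Optional<UUID> tryParse(String value) {
        try {
            return Optional.of(UUID.fromString(value));
        } catch (IllegalArgumentException notAUuid) {
            return Optional.empty();
        }
    }
}
```

Of course, the better fix is a column type that makes "Releases does not have a UUID" unstorable in the first place.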



Error'd: Please Reboot Faster, I Can't Wait Any Longer

"Saw this at a German gas station along the highway. The reboot screen at the pedestal just kept animating the hourglass," writes Robin G.

 

"Somewhere, I imagine there's a large number of children asking why their new bean bag is making them feel hot and numb," Will N. wrote.

 

Joel B. writes, "I came across these 'deals' on the Microsoft Canada store. Normally I'd question it, but based on my experiences with Windows, I bet, to them, the math checks out."

 

Kyle H. wrote, "Truly, nothing but the best quality strip_zeroes will be accepted."

 

"My Nan is going to be thrilled at the special discount on these masks!" Paul R. wrote.

 

Paul G. writes, "I know it seemed like the hours were passing more slowly, and thanks to Apple, I now know why."

 



CodeSOD: A Variation on Nulls

Submitter “NotAThingThatHappens” stumbled across a “unique” way to check for nulls in C#.

Now, there are already a few perfectly good ways to check for nulls. variable is null, for example, or using nullable types. But “NotAThingThatHappens” found another approach:

if(((object)someObjThatMayBeNull) is var _)
{
    //object is null, react somehow
} 
else
{ 
  UseTheObjectInAMethod(someObjThatMayBeNull);
}

What I hate most about this is how cleverly it exploits the C# syntax to work.

Normally, the _ is a discard. It’s meant to be used for things like tuple unpacking, or in cases where you have an out parameter but don’t actually care about the output- foo(out _) just discards the output data.

But _ is also a perfectly valid identifier. So var _ creates a variable _, and the type of that variable is inferred from context- in this case, from the expression it’s being matched against. This variable is scoped to the if block, so we don’t have to worry about it leaking into our namespace, but since it’s never initialized, it holds the appropriate default value for its type- and for reference types, that default is null. By casting explicitly to object, we guarantee that our type is a reference type, so this makes sure that we don’t get weird behavior on value types, like integers.

So really, this is just an awkward way of saying someObjThatMayBeNull is null.

NotAThingThatHappens adds:

The code never made it to production… but I was surprised that the compiler allowed this.
It’s stupid, but it WORKS!

It’s definitely stupid, it definitely works, I’m definitely glad it’s not in your codebase.
