Dilbert





The Daily WTF

CodeSOD: Switch On Suppression;

Krista noticed our article explaining that switches were replacements for ifs. She sent in a version she found in her codebase, around the same idea:

	@SuppressWarnings("incomplete-switch")
	@Transactional
	public void removeAssetFromPackage(Package pkg, Asset assetToRemove) {
		pkg.getAssets().remove(assetToRemove);
		// Delete from DB and asset store.
		removeAsset(pkg, assetToRemove);

		// If we're removing LIVE asset, also delete AsyncJobs.
		switch (assetToRemove.getType()) {
			case LIVE:
				asyncJobService.removeAsyncJobsForPresentation(pkg);
				break;
		}

		// Flush package cache.
		cacheInvalidationService.invalidatePresenationCache(pkg);
	}

Once again, we use a switch instead of an if. Perhaps this was premature flexibility- there are obviously other states the getType method could return. Maybe, someday in the future, they’ll need to do other things inside that switch. Until then, it’s just the sort of thing the compiler will throw warnings about, since there’s no default case.

Oh, except, of course, they suppressed the warning up top.

A quick search in Krista’s codebase for @SuppressWarnings("incomplete-switch") finds dozens of usages of this pattern.

[Advertisement] Forget logs. Next time you're struggling to replicate error, crash and performance issues in your apps - Think Raygun! Installs in minutes. Learn more.


CodeSOD: The Secure Cloud API;

Melinda's organization has purchased a cloud-based storage system. Like any such system, it has a lovely API which lets you manage quotas and login tokens. It also had a lovely CLI, which was helpful for administrators to modify the cloud environment. Melinda's team built a PHP front-end that could not only manage files, but also allowed administrators to manage those quotas.

Melinda was managing those quotas, and when she clicked the link to view the quotas, she noticed the URL contained ?token=RO-cmV1c2luZyBrZXlzIGlzIFRSV1RG. When she went to modify the quota, the URL parameter became ?token=RW-cmV1c2luZyBrZXlzIGlzIFRSV1RG. That looked like a security key for their cloud API, transmitted in the open. The RW and RO looked like they had something to do with readwrite and readonly, but that wasn't the security model their storage provider used. When Melinda had another co-worker log in, they saw the same tokens. What was going on?

Melinda took a look at the authorization code.

function authorised($token, $verb) { // check if token can do verb // TO DO - turn this in to a database lookup $rights = array( 'RW-cmV1c2luZyBrZXlzIGlzIFRSV1RG' => array('getquota' => 0, 'setquota' => 1), 'RO-cmV1c2luZyBrZXlzIGlzIFRSV1RG' => array('getquota' => 1, 'setquota' => 0) ); return ((isset($rights[$token]) && isset($rights[$token][$verb]) && ($rights[$token][$verb] == 1))); }

The developer behind this wrote their own security model, instead of using the one their storage provider offered. The tokens here were hard-coded secret keys for the API. Essentially this meant no matter who logged in to manage quotas on the application side, the storage system saw them all as a single user- a single user with pretty much unlimited permissions that had root access on every VM in their cloud environment.

"Oh, boy, they released their secret key to essentially a root account, that's bad," you say. It gets worse. The code doesn't, but the logic does.

You see, the culprit here didn't want to learn the API. So they did everything via shell commands. They had one machine set up in the cloud environment with a bunch of machine-based permissions that allowed it to control anything in the cloud. When someone wanted to change quotas, the PHP code would shell out and use SSH to log into that cloud machine as root, and then run the cloud vendor's proprietary CLI tool from within their own environment.

[Advertisement] ProGet supports your applications, Docker containers, and third-party packages, allowing you to enforce quality standards across all components. Download and see how!


Error'd: Stay Away From California;

"Deep down, I knew this was one of the most honest labels I've ever seen," wrote Bob E.

 

"Some people struggle to get and keep a high credit score. I, on the other hand, appear to have gotten a high score," writes Paweł.

 

Steve M. wrote, "I'm trying to register our travel insurance with ROL Cruises, but our travel policy has been rejected because it's under age."

 

"This happens every time my mom tries to place a call in her car," writes Dylan S., "Strangely enough, the call still goes through."

 

"While I appreciate the one year warranty, I don't think I will be using these tea cakes to replace my HP battery," J.R. wrote.

 

Neil D. writes, "So, am I supposed to enter -1 and then I can buy one?"

 

[Advertisement] Forget logs. Next time you're struggling to replicate error, crash and performance issues in your apps - Think Raygun! Installs in minutes. Learn more.


Crazy Like a Fox(Pro);

“Database portability” is one of the key things that modern data access frameworks try and ensure for your application. If you’re using an RDBMS, the same data access layer can hopefully work across any RDBMS. Of course, since every RDBMS has its own slightly different idiom of SQL, and since you might depend on stored procedures, triggers, or views, you’re often tied to a specific database vendor, and sometimes a version.

Keulemans Chama fox.png

And really, for your enterprise applications, how often do you really change out your underlying database layer?

Well, for Eion Robb, it’s a pretty common occurrence. Their software, even their SaaS offering of it, allows their customers a great deal of flexibility in choosing a database. As a result, their PHP-based data access layer tries to abstract out the ugly details, they restrict themselves to a subset of SQL, and have a lot of late nights fighting through the surprising bugs.

The databases they support are the big ones- Oracle, SQL Server, MySQL, and FoxPro. Oh, there are others that Eion’s team supports, but it’s FoxPro that’s the big one. Visual FoxPro’s last version was released in 2004, and the last service pack it received was in 2007. Not many vendors support FoxPro, and that’s one of Eion’s company’s selling points to their customers.

The system worked, mostly. Until one day, when it absolutely didn’t. Their hosted SaaS offering crashed hard. So hard that the webserver spinlocked and nothing got logged. Eion had another late night, trying to trace through and figure out: which customer was causing the crash, and what were they doing?

Many hours of debugging and crying later, Eion tracked down the problem to some code which tracked sales or exchanges of product- transactions which might not have a price when they occur.

$query .= odbc_iif("SUM(price) = 0", 0, "SUM(priceact)/SUM(" . odbc_iif("price != 0", 1, 0) . ")") . " AS price_avg ";

odbc_iif was one of their abstractions- an iif function, aka a ternary. In this case, if the SUM(price) isn’t zero, then divide the SUM(priceact) by the number of non-zero prices in the price column. This ensures that there is at least one non-zero price entry. Then they can average out the actual price across all those non-zero price entries, ignoring all the “free” exchanges.

This line wasn’t failing all the time, which added to Eion’s frustration. It failed when two very specific things were true. The first factor was the database- it only failed in FoxPro. The second factor was the data- it only failed when the first product in the resultset had no entries with a price greater than zero.

Why? Well, we have to think about where FoxPro comes from. FoxPro’s design goal was to be a data-driven programming environment for non-programmers. Like a lot of those environments, it tries its best not to yell at you about types. In fact, if you’re feeding data into a table, you don’t even have to specify the type of the column- it will pick the “correct” type by looking at the first row.

So, look at the iif again. If the SUM(price) = 0 we output 0 in our resultset. Guess what FoxPro decides the datatype must be? A single digit number. If the second row has an average price of, say, 9.99, that’s not a single digit number, and FoxPro explodes and takes down everything else with it.

Eion needed to fix this in a way that didn’t break their “database agnostic” code, and thus would continue to work in FoxPro and all the other databases, with at least predictable errors (that don’t crash the whole system). In the moment, suffering through the emergency, Eion changed the code to this:

$query .= "SUM(priceact)/SUM(" . odbc_iif("price != 0", 1, 0) . ")") . " AS price_avg ";

Without the zero check, any products which had no sales would trigger a divide-by-zero error. This was a catchable, trappable error, even in FoxPro. Eion made the change in production, got the system back up and their customers happy, and then actually put the change in source control with a very apologetic commit message.

[Advertisement] Utilize BuildMaster to release your software with confidence, at the pace your business demands. Download today!


CodeSOD: Padding Your Time;

Today will be a simple one, and it’s arguably low-hanging fruit, because once again, it’s date handling code. But it’s not handling dates where it falls down. It falls down on something much more advanced: conditionals. Supplied by “_ek1n”.

if ($min == 0) {
    if ($hours == 12) {
        $hours = 12;
        $min   = '00';
    } else {
        $hours = $hours;
        $min   = '00';
    }
}

My favorite part is the type-conversion/padding: turning 0 into '00', especially the fact that the same padding doesn’t happen if $min is 1, or 3, or anything else less than 10. Or, it probably does happen, and it probably happens in a general fashion which would also pad out the 0- or maybe it doesn’t, and TRWTF is that their padding method can’t zero-pad a zero.

The most likely situation, though, is that this is just a brain fart.

[Advertisement] Otter - Provision your servers automatically without ever needing to log-in to a command prompt. Get started today!


CodeSOD: Wear a Dunder Cap;

In the Python community, one buzzword you’ll find thrown around is whether or not an approach is “pythonic”. It’s a flexible term, and something you can just throw out in code reviews, even if you’ve never written a line of Python in your life: “Is that Pythonic?”

The general rubric for what truly is “pythonic” is generally code that is simple and code that operates explicitly. There shouldn’t be any “magic”. But Python doesn’t force you to write “pythonic” code, and it provides loads of tools like decorators and metaclasses that let you get as complex and implicit as you like.

One bit of magic is the “dunder” methods, which are Python’s vague approach to operator overloading. If you want your class to support the [] operator, implement __getitem__. If you want your class to support the + operator, implement __add__. If you want your class to support access to any arbitrary property, implement __getattr__.

Yes, __getattr__ allows you to execute code on any property access, so a simple statement like this…

if self.connected:
  print(self.coffee)

…could involve twenty database calls and invoke a web service connected to a live coffee machine and literally make you a cup of coffee. At no point does the invoker ever see that they’ve called a method, it just happens by “magic”. And, for extra fun, there’s also a global method getattr, lets you wrap property access with a default, e.g., getattr(some_object, "property", False) will return the value of some_object.property, if it exists, or False if it doesn’t.

Whew, that’s a lot of Python internals, but that brings us to today’s anonymous submission.

Someone wrote some classes which might contain other classes, and wanted them to delegate property access to those, which means there are a number of classes containing this method:

def __getattr__(self, item):
    return self.nested.item

So, for example, you could call some_object.log_level, and this overridden __getattr__ would walk through the whole chain of objects to find where the log_level exists.

That’s fine, if the chain doesn’t contain any loops, where the same object appears in the chain multiple times. If, on the other hand, it does, you get a RecursionError when the loop finally exceeds the system recursion limit.

Our submitter found this a bit of a problem. When the access to log_level failed, it might take a long time before hitting the RecursionError- which, by the way, isn’t triggered by a stack overflow. Python tries to throw a RecursionError before you overflow the stack.

You can, but very much shouldn’t control that value. But our submitter didn’t want to wait for the thousands of calls it might take to get the RecursionError, so they wrapped their accesses to log_level in code that would tweak the recursion limit:

    # Protect against origin having overwritten __getattr__
    old_recursion_limit = sys.getrecursionlimit()
    n = 2
    while True:
        try:
            sys.setrecursionlimit(n)
        except RecursionError as e:
            n += 1
        break
    try:
        log_level = getattr(origin, "log_level", None)
    except RecursionError as e:
        object.__setattr__(origin, "log_level", None)
        log_level = getattr(origin, "log_level", None)
    sys.setrecursionlimit(old_recursion_limit)

So, first, we check the current recursion limit. Then, we try setting the recursion limit to two. If the current recursive depth is greater than two, then when we try to change the recursion limit it throws a RecursionError. So catch that, and then try again with one higher recursion limit.

Once we’ve set the recursion limit to one greater than our current recursion limit, we then try and fetch the log level. If we fail, we’ll just default to None, and finally return back to our original recursion limit.

This is an impressive amount of effort into constructing a footgun. From the use of __getattr__ in the first place, to the misplaced effort of short-circuiting recursion instead of preventing cycles in the first place, and finally to the abuse of setrecursionlimit and error handling to build… this. Absolutely nothing here should have happened. None of it.

Our submitter has confessed their sins. As penance, they’ll say twenty Hail Guidos, and fix this code. Remember, you can’t be absolved of your sins if you just keep on sinning.

[Advertisement] Continuously monitor your servers for configuration changes, and report when there's configuration drift. Get started with Otter today!


CodeSOD: Legacy Switchout;

About a decade ago, I attended a talk. The speaker made the argument that "legacy code" may have many possible interpretations, but the practical view was to simply think of legacy code as "code without unit tests". Thus, the solution to modernizing your legacy code was to simply write unit tests. Refactoring the code to make it testable would have the side effect of modernizing the code base, and writing tests would act as documentation. It's that easy.

Andrew is struggling with some legacy code right now. Worse, they're trying to integrate a large mountain of legacy code into a custom, in-house CI/CD pipeline. This particular pile of legacy code dates back to the mid-2000s, so everything in it is glued together via XML. It was some of that XML code which started failing when Andrew threw some unit tests at it.

It doesn't start that bad:

static bool IsNamespace(XName name) { return name.LocalName == "xmlns" || name.NamespaceName == "http://www.w3.org/2000/xmlns/"; }

Now, this is a silly method, and even .Net 1.0 had an XmlNamespaceManager for handling interacting with namespaces. Without knowing more of the overall context, maybe this method was actually useful? At the very least, this method isn't a WTF.

One thing I want you to note, however, is that this method does the sane thing, and simply returns the result of a boolean comparison. That method was written by the same developer, and checked in on the same day as this method:

static bool IsNewtonsoftNamespace(XAttribute attribute) { switch (attribute.Value) { case "http://james.newtonking.com/projects/json": return true; } switch (attribute.Name.NamespaceName) { case "http://james.newtonking.com/projects/json": return true; } return false; }

Between writing IsNamespace and IsNewtonsoftNamespace, the developer apparently forgot that they could just return boolean expressions. They also apparently forgot- or perhaps never knew- what an if statement is supposed to do.

Andrew at least has the code building in their CI solution, but can look forward to many more weeks of fixing test failures by going through code like this.

[Advertisement] Ensure your software is built only once and then deployed consistently across environments, by packaging your applications and components. Learn how today!