Tuesday, May 24, 2022

Software Engineering: Crash Course Computer Science #16

Hi, I’m Carrie Anne, and welcome to Crash Course Computer Science! So we’ve talked a lot about sorting in this series, and often code to sort a list of numbers might only be ten lines long, which is easy enough for a single programmer to write. Plus, it’s short enough that you don’t need any special tools; you could do it in Notepad. Really! But a sorting algorithm isn’t a program; it’s likely only a small part of a much larger program. For example, Microsoft Office has roughly 40 million lines of code.

40 MILLION! That’s way too big for any one person to figure out and write! To build huge programs like this, programmers use a set of tools and practices. Taken together, these form the discipline of Software Engineering, a term coined by engineer Margaret Hamilton, who helped NASA prevent serious problems during the Apollo missions to the moon. She once explained it this way: “It’s kind of like a root canal: you waited till the end, [but] there are things you could have done beforehand. It’s like preventative healthcare, but it’s preventative software.”

INTRO

As I mentioned in episode 12, breaking big programs into smaller functions allows many people to work simultaneously. They don’t have to worry about the whole thing, just the function they’re working on. So, if you’re tasked with writing a sort algorithm, you only need to make sure it sorts properly and efficiently. However, even packing code up into functions isn’t enough. Microsoft Office probably contains hundreds of thousands of them. That’s better than dealing with 40 million lines of code, but it’s still way too many “things” for one person or team to manage.

The solution is to package functions into hierarchies, pulling related code together into “objects”. For example, a car’s software might have several functions related to cruise control, like setting speed, nudging speed up or down, and stopping cruise control altogether. Since they’re all related, we can wrap them up into a unified cruise control object. But we don’t have to stop there: cruise control is just one part of the engine’s software. There might also be sets of functions that control spark plug ignition, fuel pumps, and the radiator.

So we can create a “parent” Engine Object that contains all of these “children” objects. In addition to children *objects*, the engine itself might have its *own* functions. You want to be able to stop and start it, for example. It’ll also have its own variables, like how many miles the car has traveled. In general, objects can contain other objects, functions and variables. And of course, the engine is just one part of a Car Object. There’s also the transmission, wheels, doors, windows, and so on.
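The nesting described above can be sketched in a few lines of Python. This is only an illustration, not code from any real car: the class names, method names, and the `target_speed` and `running` variables are all assumptions made up for the example.

```python
class CruiseControl:
    """Groups the cruise-control functions into one object (hypothetical names)."""

    def __init__(self):
        self.target_speed = None  # None means cruise control is off

    def set_cruise_speed(self, speed):
        self.target_speed = speed

    def nudge_speed(self, delta):
        # Nudging only makes sense while cruise control is active.
        if self.target_speed is not None:
            self.target_speed += delta

    def stop(self):
        self.target_speed = None


class Engine:
    """A parent object: it holds a child object, its own functions, and its own variables."""

    def __init__(self):
        self.cruise_control = CruiseControl()  # child object
        self.miles_traveled = 0                # the engine's own variable
        self.running = False

    def start(self):
        self.running = True

    def stop(self):
        self.running = False


class Car:
    """The outermost object; a fuller model would also hold wheels, doors, and so on."""

    def __init__(self):
        self.engine = Engine()


car = Car()
car.engine.start()
# Walk down the hierarchy with dots: car, then engine, then cruise control.
car.engine.cruise_control.set_cruise_speed(55)
car.engine.cruise_control.nudge_speed(-5)
```

Each dot steps one level deeper into the hierarchy, so the last two lines read as "car, then engine, then cruise control, then the function to call".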

Now, as a programmer, if I want to set the cruise control, I navigate down the object hierarchy, from the outermost objects to more and more deeply nested ones. Eventually, I reach the function I want to trigger: “Car, then engine, then cruise control, then set cruise speed to 55”. Programming languages often use something equivalent to the syntax shown here. The idea of packing up functional units into nested objects is called Object Oriented Programming. This is very similar to what we’ve done all series long: hide complexity by encapsulating low-level details in higher-order components. Before, we packed up things like transistor circuits into higher-level boolean gates.

Now we’re doing the same thing with software. Yet again, it’s a way to move up a new level of abstraction! Breaking up a big program, like a car’s software, into functional units is perfect for teams. One team might be responsible for the cruise control system, and a single programmer on that team tackles a handful of functions. This is similar to how big, physical things are built, like skyscrapers. You’ll have electricians running wires, plumbers fitting pipes, welders welding, painters painting, and hundreds of other people teeming all over the hull.

They work together on different parts simultaneously, leveraging their different skills. Until one day, you’ve got a whole working building! But, returning to our cruise control example… its code is going to have to make use of functions in other parts of the engine’s software, to, you know, keep the car at a constant speed. That code isn’t part of the cruise control team’s responsibility. It’s another team’s code. Because the cruise control team didn’t write that, they’re going to need good documentation about what each function in the code does, and a well-defined Application Programming Interface, or API for short.

You can think of an API as the way that collaborating programmers interact across various parts of the code. For example, in the IgnitionControl object, there might be functions to set the RPM of the engine, check the spark plug voltage, as well as fire the individual spark plugs. Being able to set the motor’s RPM is really useful; the cruise control team is going to need to call that function. But they don’t know much about how the ignition system works. It’s not a good idea to let them call functions that fire the individual spark plugs. Or the engine might explode!

Maybe. The API allows the right people access to the right functions and data. Object Oriented Programming languages do this by letting you specify whether functions are public or private. If a function is marked as “private”, it means only functions inside that object can call it. So, in this example, only other functions inside of IgnitionControl, like the setRPM function, can fire the spark plugs. On the other hand, because the setRPM function is marked as public, other objects can call it, like cruise control.

This ability to hide complexity, and selectively reveal it, is the essence of Object Oriented Programming, and it’s a powerful and popular way to tackle building large and complex programs. Pretty much every piece of software on your computer, or game running on your console, was built using an Object Oriented Programming Language, like C++, C# or Objective-C. Other popular “OO” languages you may have heard of are Python and Java. It’s important to remember that code, before being compiled, is just text. As I mentioned earlier, you could write code in Notepad or any old word processor. Some people do.
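Returning to the IgnitionControl example, here is a rough Python sketch of the public/private split. Everything in it is an assumption for illustration: the four-plug count, the `sparks_fired` counter, and the snake_case names `set_rpm` and `__fire_spark_plug` (standing in for setRPM and the spark plug function). Python has no `private` keyword. A leading double underscore triggers name mangling, which hides the function from casual outside calls but is weaker than the enforced `private` of languages like C++ or Java.

```python
class IgnitionControl:
    SPARK_PLUG_COUNT = 4  # hypothetical four-cylinder engine

    def __init__(self):
        self.rpm = 0
        self.sparks_fired = 0

    def set_rpm(self, rpm):
        """Public: other objects, such as cruise control, may call this."""
        self.rpm = rpm
        for plug in range(self.SPARK_PLUG_COUNT):
            self.__fire_spark_plug(plug)  # fine: called from inside the class

    def __fire_spark_plug(self, plug):
        """Private by name mangling: meant only for code inside IgnitionControl."""
        self.sparks_fired += 1


ignition = IgnitionControl()
ignition.set_rpm(2000)  # allowed: public function

try:
    ignition.__fire_spark_plug(0)  # outside the class, this name is not found
    blocked = False
except AttributeError:
    blocked = True
```

The cruise control team gets what it needs through `set_rpm`, while the spark plug logic stays an internal detail the ignition team can change freely.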

But generally, today’s software developers use special-purpose applications for writing programs, ones that integrate many useful tools for writing, organizing, compiling and testing code. Because they put everything you need in one place, they’re called Integrated Development Environments, or IDEs for short. All IDEs provide a text editor for writing code, often with useful features like automatic color-coding to improve readability. Many even check for syntax errors as you type, like spell check for code. Big programs contain lots of individual source files, so IDEs allow programmers to organize and efficiently navigate everything.

Also built right into the IDE is the ability to compile and run code. And if your program crashes, because it’s still a work in progress, the IDE can take you back to the line of code where it happened, and often provide additional information to help you track down and fix the bug, which is a process called debugging. This is important because most programmers spend 70 to 80% of their time testing and debugging, not writing new code. Good tools, contained in IDEs, can go a long way when it comes to helping programmers prevent and find errors.
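To give a feel for the information a debugger works from, here is a small Python sketch that catches a crash and looks up where it happened using the standard `traceback` module. The `average` function and the `report` format are made up for this example; a real IDE presents the same kind of data in a much richer way.

```python
import traceback


def average(values):
    return sum(values) / len(values)  # divides by zero for an empty list


try:
    average([])
except ZeroDivisionError as error:
    # The last traceback entry is the frame where the error was raised.
    crash_site = traceback.extract_tb(error.__traceback__)[-1]
    report = f"{type(error).__name__} in {crash_site.name}(), line {crash_site.lineno}"
```

Here `report` names the error type, the function that failed, and the line number, which is roughly what an IDE uses to jump you back to the offending line.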

Many computer programmers can be pretty loyal to their IDEs though, but let’s be honest. VIM is where it’s at. Providing you know how to quit.

In addition to coding and debugging, another important part of a programmer’s job is documenting their code. This can be done in standalone files called “read-me’s” which tell other programmers to read that help file before diving in. It can also happen right in the code itself with comments. These are specially-marked statements that the compiler knows to ignore when the code is compiled. They exist only to help programmers figure out what’s what in the source code. Good documentation helps programmers when they revisit code they haven’t seen for a while, but it’s also crucial for programmers who are totally new to it.

I just want to take a second here and reiterate that it’s THE WORST when someone parachutes a load of uncommented and undocumented code into your lap, and you literally have to go line by line to understand what the code is doing. Seriously. Don’t be that person.
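As a small illustration of documented code, here is a commented sort function in Python. The function name and the choice of insertion sort are assumptions for the example. Lines starting with `#` are comments, and the triple-quoted string is a docstring, which Python keeps around so tools can display it.

```python
def sort_numbers(numbers):
    """Return a new list with the numbers in ascending order.

    The input list is left unchanged.
    """
    result = list(numbers)  # copy, so the caller's list is not modified
    # Insertion sort: grow a sorted prefix one element at a time.
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements right to make room for `current`.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


# The first docstring line doubles as a one-sentence summary for other programmers.
summary = sort_numbers.__doc__.splitlines()[0]
```

Someone new to this code can read `summary`, or call `help(sort_numbers)`, and learn what the function does without stepping through the loop.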

Documentation also promotes code reuse. So, instead of having programmers constantly write the same things over and over, they can track down someone else’s code that does what they need. Then, thanks to documentation, they can put it to work in their program, without ever having to read through the code. “Read the docs” as they say.

In addition to IDEs, another important piece of software that helps big teams work collaboratively on big coding projects is called Source Control, also known as version control or revision control. Most often, at a big software company like Apple or Microsoft, code for projects is stored on centralized servers, called a code repository. When a programmer wants to work on a piece of code, they can check it out, sort of like checking out a book from a library. Often, this can be done right in an IDE. Then, they can edit this code all they want on their personal computer, adding new features and testing if they work. When the programmer is confident their changes are working and there are no loose ends, they can check the code back into the repository, known as committing code, for everyone else to use.
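The check-out and commit cycle can be modeled with a toy, in-memory repository. This is a sketch only and not how any real source control tool works: the `Repository` class, its method names, the file name, and the programmer name are all invented. It also assumes a simple lock-style rule where a file can be checked out by one programmer at a time.

```python
class Repository:
    """Toy in-memory code repository (hypothetical; not a real tool's API)."""

    def __init__(self, files):
        self.files = dict(files)   # file name -> current contents
        self.checked_out = {}      # file name -> programmer holding it
        self.history = []          # (author, file name, old contents) per commit

    def check_out(self, name, programmer):
        if name in self.checked_out:
            raise RuntimeError(f"{name} is checked out by {self.checked_out[name]}")
        self.checked_out[name] = programmer
        return self.files[name]    # working copy to edit locally

    def commit(self, name, programmer, new_contents):
        if self.checked_out.get(name) != programmer:
            raise RuntimeError(f"{programmer} has not checked out {name}")
        # Remember who changed what, and what the file looked like before.
        self.history.append((programmer, name, self.files[name]))
        self.files[name] = new_contents
        del self.checked_out[name]  # check the file back in


repo = Repository({"cruise.py": "speed = 0"})
working_copy = repo.check_out("cruise.py", "ada")
working_copy = working_copy.replace("0", "55")  # edit on a personal machine
repo.commit("cruise.py", "ada", working_copy)
```

Because every commit saves the author and the previous contents, the repository always has a record of what changed and who changed it.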

While a piece of code is checked out, and presumably getting updated or modified, other programmers leave it alone. This prevents weird conflicts and duplicated work. In this way, hundreds of programmers can be simultaneously checking in and out pieces of code, iteratively building up huge systems. Critically, you don’t want someone committing buggy code, because other people and teams may rely on it. Their code could crash, creating confusion and lost time. The master version of the code, stored on the server, should always compile without errors and run with minimal bugs.

But sometimes bugs creep in. Fortunately, source control software keeps track of all changes, and if a bug is found, the whole code, or just a piece, can be rolled back to an earlier, stable version. It also keeps track of who made each change, so coworkers can send nasty, I mean, helpful and encouraging emails to the offending person. Debugging goes hand in hand with writing code, and it’s most often done by an individual or small team. The big picture version of debugging is Quality Assurance testing, or QA.

This is where a team rigorously tests out a piece of software, attempting to create unforeseen conditions that might trip it up. Basically, they elicit bugs. Getting all the wrinkles out is a huge effort, but vital in making sure the software works as intended for as many users in as many situations as imaginable before it ships. You’ve probably heard of beta software. This is a version of software that’s mostly complete, but not 100% fully tested. Companies will sometimes release beta versions to the public to help them identify issues; it’s essentially like getting a free QA team.

What you don’t hear about as much is the version that comes before the beta: the alpha version. This is usually so rough and buggy, it’s only tested internally. So, that’s the tip of the iceberg in terms of the tools, tricks and techniques that allow software engineers to construct the huge pieces of software that we know and love today, like YouTube, Grand Theft Auto 5, and PowerPoint. As you might expect, all those millions of lines of code need some serious processing power to run at useful speeds, so next episode we’ll be talking about how computers got so incredibly fast. See you then.
