(Tr|b)ash - Thoughts on Unix Scripting



Shell scripting. The long and short of it is that bash is available on Unix systems and is a relatively powerful and, for the most part, unchanging scripting environment. Old Solaris boxes, BSD machines, and nearly every Linux box have bash available. Unlike Python, if you have an older version of bash, you are unlikely to even notice. So outside of a few people concerned with POSIX-compliant shell scripting (P1003.2), you are probably doing your scripting in bash - it’s quite portable.

The Unix shell generally is quite powerful. A Unix shell is a macro processor that executes programs. Most shells, including bash, come with relatively sophisticated string processing; in bash this includes left- and right-hand stripping, tokenizing, and even regular expressions. There are special features for working with files, including redirection and globbing, as well as tests for file attributes (writable? directory?) built into the conditional test ([[ -e $FILE ]]). Finally, the pipelined model of data flow made possible by pipes allows chaining commands to produce something arbitrarily complex. All of this power, presented simply and incrementally, tends to encourage the creation of casual scripts, which empowers the user. So in brief, the shell’s strengths are its simplicity, pipes, and string manipulation.
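As a rough illustration of the string and file features just listed, here is a minimal bash sketch. The file name, the CSV string, and the address are made-up sample values, not anything from a real script.

```bash
#!/usr/bin/env bash
# Sample values, chosen only for illustration.
file="report.final.txt"
csv="alpha,beta,gamma"
ip="10.0.0.1"

# Right- and left-hand stripping via parameter expansion.
stem="${file%.*}"      # remove the shortest match of ".*" from the right
ext="${file##*.}"      # remove the longest match of "*." from the left

# Tokenizing: split on commas into an array.
IFS=, read -r -a fields <<< "$csv"

# Regular expressions: capture the first octet of the address.
re='^([0-9]+)\.'
if [[ $ip =~ $re ]]; then
    first_octet="${BASH_REMATCH[1]}"
fi

# File attribute tests built into the conditional.
if [[ -d / ]]; then
    root_is_dir=yes
fi

printf '%s %s %s %s %s\n' "$stem" "$ext" "${fields[1]}" "$first_octet" "$root_is_dir"
```

None of this needs an external program, which is a large part of why casual scripts are so cheap to write.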

Historically, there has been no effective competitor to the Unix CLI. The Microsoft Windows CMD.exe command processor is harder to understand, and not even close to as capable. For instance, CMD offers nothing like mkfifo for creating a named pipe from the command line, even though Windows itself has named pipes. Regular expressions are limited to what findstr provides, and serious globbing is not available. Woe to anyone learning “Batch”.

But bash is severely limited, and there are several new competitors for the dual role of shell and scripting language on *nix. How do Python and Powershell stack up?
Bash is Trash

But, bash sucks.
No Return Values

Bash has functions, and its functions imitate the interface of commands - they are called like any normal Unix command, and their arguments are processed like those of any normal Unix command. This was a design decision that is simple and beautiful - and that also cripples functions. Unix commands don’t have return values like functions in a normal programming language - they return an unsigned 8-bit status code. As you can imagine, the vast majority of useful functions want to return something else.

Bash has a couple of different hacks for this. You can return via standard output by printing your return value and capturing it at the caller - the method I find most convenient. The return is effectively done via printf, and captured at the call site with RET=$(fn arg arg arg). This obviously has the disadvantage of preventing the function from using standard output for anything else - something relatively easy to live with.
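A minimal sketch of the difference between the status code and a value "returned" on standard output. The function names join_path and is_even are hypothetical, made up for this example.

```bash
#!/usr/bin/env bash
# Hypothetical helper: "returns" a string by printing it on stdout.
join_path() {
    # ${1%/} drops one trailing slash from the directory, if present.
    printf '%s/%s' "${1%/}" "$2"
}

# Hypothetical predicate: its only result is the exit status (0 = true).
is_even() {
    (( $1 % 2 == 0 ))
}

# The caller captures stdout with command substitution.
RET=$(join_path /etc/ passwd)
echo "$RET"

# The status code is only good for yes/no style answers.
if is_even 4; then
    echo "4 is even"
fi
```

Note that command substitution strips trailing newlines from the captured value, and the function body runs in a subshell, so it cannot change the caller’s variables.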

Another choice is returning via a global variable. This is generally frowned upon in procedural programming because it can create debugging nightmares, and debugging is not even close to a strength of bash - especially because bash tends to fail silently. In addition, global variables can couple your program tightly - all functions end up using the same globals, and cannot be reused in other scripts without serious integration work. Bash doesn’t worry much about this, because it doesn’t have namespaces or modules. You’re supposed to use a real programming language at that point. God forbid your function calls another function that ends up modifying the same global. Globals are just a bad choice, but bash will happily allow you to use them.

You can also pass variables by name. Passing by name is something like passing a pointer in C. You pass a reference to the variable (its name) and then modify the variable through it. I find that many functions want to return anonymous values that are used in other calculations. Creating names for each variable used in a calculation is a pain in the ass.
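Passing by name can be sketched with printf -v, which assigns to the variable whose name it is given. The helper split_host_port and the sample address are hypothetical; newer bash versions also offer namerefs (local -n) for the same purpose.

```bash
#!/usr/bin/env bash
# Hypothetical helper: writes its two results into the variables
# whose NAMES are passed as $1 and $2.
split_host_port() {
    local host_var=$1 port_var=$2 input=$3
    printf -v "$host_var" '%s' "${input%:*}"    # everything before the last ":"
    printf -v "$port_var" '%s' "${input##*:}"   # everything after the last ":"
}

split_host_port host port "example.org:8080"
echo "$host $port"
```

The catch is visible in the sketch: the caller has to invent the names host and port up front, and a caller variable that happens to be called host_var or port_var would collide with the function’s locals.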

No return values becomes a bigger deal when combined with the way arguments to bash functions are handled.
No real argument handling, types or data structures

Because of bash’s design decision to treat functions like commands, there are no argument types, or even a fixed number of arguments. Unix commands receive an array of strings as arguments and that isn’t negotiable. Functions in programming languages generally don’t. Bash makes life as simple as possible by almost transparently converting between strings and numeric types in the correct contexts, but this isn’t really enough. Imagine that you want to pass an array to a function - normal enough behavior in a real program. Bash has arrays, but an array cannot be passed as a single argument: it is expanded into a flat list of strings, which the function has to collect into an array again. This becomes problematic with more than one array, because the boundary between them is lost. Bash will allow you to solve this problem via correct quoting and some bookkeeping, but this seems like a hack.

Really, the number of arguments and their types are not fixed, and while that can be advantageous and easy to work with in some cases, it can be nightmarish in others. Effectively, you have to flatten and re-parse arrays through each function call.
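One way to sketch the flatten and re-parse dance is to prefix each array with its length, so the function can find the boundary again. The function show_two and the sample arrays are hypothetical, and this is only one of several possible conventions.

```bash
#!/usr/bin/env bash
# Hypothetical function taking two arrays, each preceded by its length:
#   show_two <n> <n items...> <m> <m items...>
show_two() {
    local n=$1; shift
    local -a first=("${@:1:n}")    # re-collect the first n arguments
    shift "$n"
    local m=$1; shift
    local -a second=("${@:1:m}")   # re-collect the next m arguments
    printf 'first=%d second=%d\n' "${#first[@]}" "${#second[@]}"
    printf 'first[0]=%s\n' "${first[0]}"
}

a=("x y" z)
b=(1 2 3)

# Quoting "${a[@]}" keeps "x y" as one argument; the lengths mark the boundary.
out=$(show_two "${#a[@]}" "${a[@]}" "${#b[@]}" "${b[@]}")
echo "$out"
```

It works, but every function that takes arrays has to repeat this parsing, and every caller has to remember the convention.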
No Libraries / Modules / Namespaces

Given that bash doesn’t have real argument handling or return values for functions, it’s not all that surprising that it doesn’t have any kind of import, module, library, or namespace functionality either. Bash scripts aren’t really meant to grow beyond single-file complexity. Functions are meant to return via globals, and be tightly coupled to the script.

Bash does afford two small luxuries here: local scoping and sourcing. The local builtin allows you to ensure that your function’s use of a variable doesn’t overwrite some global somewhere with the same name. This is exactly the inverse of Python, where assignments inside a function are local by default and you have to opt in to using a global. Although we can’t have namespaces, our functions can have local variables. Although we can’t have libraries or modules to import, we can just run arbitrary bash code. Bash has a builtin called ‘source’ which allows us to run bash code in the current process, allowing us to keep functions in other files. Sourcing code dumps it all into the global namespace. You can mitigate this up front by giving your functions long names that include a namespace identifier.
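Both luxuries fit in a short sketch. The "netlib::" prefix, the function names, and the temporary library file are all hypothetical; the library is written to a temp file only so the example is self-contained.

```bash
#!/usr/bin/env bash
counter=10

bump() {
    local counter=0                 # shadows the global inside this function only
    counter=$(( counter + 1 ))
    printf '%s\n' "$counter"
}

bumped=$(bump)                      # the global counter is left untouched

# A stand-in for a library file kept next to the script.
lib=$(mktemp)
cat > "$lib" <<'EOF'
# The "netlib::" prefix plays the role of a namespace.
netlib::is_octet() {
    [[ $1 =~ ^[0-9]+$ ]] && (( 10#$1 <= 255 ))
}
EOF

source "$lib"                       # runs the file in the current shell
rm -f "$lib"

if netlib::is_octet 200; then
    echo "200 is a valid octet"
fi
```

The prefix is pure convention: nothing stops another sourced file from defining a function with the same name and silently replacing this one.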
Bash Fails Silently

Bash is a nightmare to debug. There isn’t a debugger, but you can run bash -x to have bash print each command of the script as it runs. What really makes debugging bash difficult is that bash tends to fail silently. Bash doesn’t require you to declare variables, and mistyping a variable name will often result in valid commands that don’t do what you expect. Undefined variables default to an empty string, or zero in numeric contexts - this can be disastrous for comparisons.
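The silent failure is easy to reproduce, and the nounset option (set -u) is one partial mitigation. This is a minimal sketch: the function demo and its deliberately mistyped variable are made up, and it only prints the command it would have run.

```bash
#!/usr/bin/env bash
demo() {
    local target_dir="/tmp/build"
    # Typo on purpose: target_dri instead of target_dir.
    echo "would remove: ${target_dri}/cache"
}

# Default behaviour: the typo expands to an empty string, no error at all.
loose=$(demo)
echo "$loose"

# With nounset enabled in the subshell, the same expansion is a fatal error.
strict=$(set -u; demo 2>/dev/null)
strict_status=$?
echo "strict run exited with status $strict_status"
```

set -u catches mistyped names, but it does nothing about the other quiet failure modes, so it is a seat belt rather than a debugger.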
Syntax : Benefit or Drawback?

Bash has a(n) (in)famous syntax. The syntax isn’t intuitive, but it is minimal. This turns out to be important for a shell. In accord with the description of Powershell’s “Elastic Syntax” feature in “Powershell In Action”, a shell should have syntax optimized for writing. Most shell commands are issued by their author for a single use, so giving up efficiency for long-term intelligibility is not a worthwhile trade. Further, in accord with the Unix philosophy, targeting expert efficiency pays off more for the Unix command line, especially now that it has few casual users.

Bash’s syntax is minimal and efficient. This has drawbacks for writing larger scripts, where readability and maintenance are more important.
Powershell is Awesome

Powershell is Microsoft’s attempt at replicating and improving on the success of the Unix CLI. As Microsoft has tried to expand into the server market, it has realized that the CLI allows the automation that is essential to administration. Microsoft has tried to enable this automation by building its own command line environment called Powershell. Powershell is now open source, but still a Microsoft product. Powershell aims to be both a shell and a scripting language, and the syntax is inspired by bash. Powershell tends to resolve many of the problems with bash, but offers its own problems for Unix.
Elastic Syntax, Shell like syntax

In my opinion, the killer feature of Powershell is “Elastic Syntax”. The idea is that the syntax and verbosity expand or contract according to the user’s needs: when using Powershell interactively as a shell, a user will likely want more abbreviations and less syntax. Implicitness and flexibility are a benefit. When using Powershell as a scripting language, more syntax and more explicitness are desirable. For instance, variable typing allows type checking. In Powershell, variable types are an optional feature.

Powershell also unifies argument handling and names its arguments. These names can be omitted or abbreviated for a minimal syntax, or made explicit to optimize for reading and maintenance.

Elastic Syntax offers a way to transition a shell to a scripting language incrementally, feature by feature, as needed.
Typed Programming Language / Real Argument Handling

Powershell is a typed programming language, and allows passing arrays explicitly to functions. Parameters can be specified by name, and the number of arguments and their types are easily controlled. No flattening and re-parsing ever need be done. Powershell can also ensure that calls make sense by type checking and argument checking.
The Pipeline

Powershell’s most famous trick is its pipeline. Unlike bash, which creates separate processes and passes data via argv, stdin and stdout, Powershell runs all its cmdlets inside the same process, and passes objects between cmdlets. This is interesting because the objects come with the ability to invoke their methods and access their members. This means that tricks like filtering don’t require regular expressions, just accessing members by name.

Get-Process | Where-Object Name -like 'cmd' | ForEach-Object { $_.Kill() }

Note, this isn’t the most efficient way to do this, just a demonstration of the pipeline passing objects. You can actually run arbitrary blocks of code on each process, including multiple cmdlets, to ultimately determine what happens to the process. Because the process is an object, we don’t lose information by having to parse some specific bit of it out and act on it in a filter; we are free to use any of its members and methods no matter where in the pipeline it appears.
Real Modularity

Powershell also has a namespace and module system.
Good Command Names

Since Microsoft was intentionally designing an ecosystem top down, with the benefit of hindsight, they got to design the command names consistently. Although this is imperfectly done, Microsoft has chosen the convention of verb-noun for the command names. A command like Get-Process lets you predict that Get-Service exists, and from those two plus the existence of Start-Service you can predict that Start-Process probably exists. Microsoft has tried to produce a convention for naming cmdlets that is intuitive.
Good Process Model

Microsoft Windows doesn’t have a process model that allows processes to have typed arguments either. So, cmdlets accepting typed arguments is an indication that something else is going on. Effectively, all Powershell cmdlets are builtins that run in the same process. This is quite unlike bash, which runs plenty of sub-processes. Powershell’s cmdlet model avoids the cost of spawning a process per command, which makes it more efficient. Efficiency isn’t terribly important for scripts, many of which will be dependent on network IO anyway, but it is nice to have.
Powershell is Terrible
Written in .NET

Powershell is written in Microsoft’s .NET. On Linux the open source release runs on the cross-platform .NET runtime (originally called .NET Core) rather than on Mono. .NET is a competitor to Java: a bytecode-driven VM provides the runtime on which the code runs. Microsoft calls this “managed” code, because garbage collection and memory management are done by the runtime. Most libraries and applications on *nix are written in C, and .NET bindings for them often don’t exist. .NET, however, is one of many visions Microsoft ha(s|d) for the future - alternatives include Modern Apps - and many Windows administration primitives are available via .NET. Although the performance of scripts probably doesn’t matter, at least at the level of native versus runtime code, the .NET runtime adds yet another layer of abstraction, if only to support Windows libraries and APIs.

Powershell allows accessing .NET libraries from scripts. This is amazingly flexible on the Windows platform, and also a security risk, because a user can effectively write new code. If you don’t install compilers on production machines because you don’t want attackers compiling code, then Powershell isn’t for you. Also, if you’re on Windows, you don’t have a choice.
Shitty Command Names

Although the syntax, including that for calling functions, is inspired by bash, Microsoft happily dumped the sacred convention that command names be lowercase and produced a bunch of hyphenated monstrosities. These names can be abbreviated via aliases, e.g. Get-ChildItem -> gci, which is much more respectable. Further, does the name Get-ChildItem actually allow you to predict that the output will list directory contents? To be sure, it can do more, but a gci call is something like a replacement for ls.
Microsoft Sucks

Microsoft Sucks. Microsoft is a giant corporation with a rich history of wishing evil on open source, if not causing it. Microsoft is a direct competitor to *nix, and the command line is really far too important for Microsoft to own. It’s true that Powershell is open source and could be forked, but allowing Microsoft to define the shell and its extensions to favour its own platform is an undesirable place for Linux to be.

Powershell is also bad for privacy - by default Powershell collects telemetry. Powershell requires you to explicitly opt out of telemetry by creating an environment variable POWERSHELL_TELEMETRY_OPTOUT and setting its value to 1, and that variable has to be set every time Powershell starts. To be sure, this can be automated, and I am sure it is trivially patched out of the source, but do you really want your shell collecting telemetry? Telemetry isn’t why I choose Linux. That said, telemetry also exists in various open source projects, including Firefox.
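Automating the opt-out amounts to exporting the variable from a shell startup file, so that any Powershell process launched from that shell inherits it. The choice of startup file (for example ~/.bashrc) is an assumption about the setup, not something Powershell requires.

```bash
# Assumed to live in a startup file such as ~/.bashrc, so it is set
# before Powershell is ever launched from this shell.
export POWERSHELL_TELEMETRY_OPTOUT=1
```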
Python Sucks

An obvious candidate for administration and automation on Linux is Python. Python is so well known that no deep analysis is required. Plenty of bindings exist, but Python isn’t a great shell because of its syntax. It’s simply not designed as a shell. Python doesn’t have a minimal syntax, or a pipeline-like data flow for passing data. Even invoking every single command as ls() would get tiring. The syntax of Python just isn’t suited to use as a shell.

Acquiring data isn’t like bash’s cat | ; it requires more subtlety and handles more complexity. That is not suitable for a shell.

Further, Python isn’t trivial to get started with like the shell is. After learning interactive use of a variety of Unix commands, knowledge of ls and its arguments doesn’t transfer to Python, where os.listdir() or os.walk() might be what you want depending on your needs.

Xonsh is an interesting attempt to merge Python and the shell by effectively just having the parser decide whether you meant Python or shell depending on context. IMO, a true shell/scripting hybrid requires special integration with the filesystem akin to globbing, and ease of running sub-processes and collecting and manipulating their data. Ironically, shell utilities being able to output JSON might fix this: parsing it into Python dictionaries might make working with native utilities easy and provide an experience similar to Powershell objects (minus the methods).

Powershell and bash are far more suitable shells than Python at this point, but the future is open.
An Illustration of the Problems.

Here is a simple thought experiment to illustrate these conclusions a bit. While viewing Rendition Infosec’s jobs page, I observed a challenge to sort IP addresses from a file numerically, in ascending order by each octet.

To do this in bash, we just use sort -V. That actually feels like cheating because we are supposed to write the program. Well, sort -n won’t exactly work. We can sort on sub-fields with sort -k, and chaining four numeric keys with -t . would also work, but that is still just leaning on sort. The answer I ultimately selected was encoding the IP addresses as what they actually are, 4-byte numbers, sorting them like normal numbers, then decoding. I did this with the filter pipeline model.
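The encode, sort, decode approach might be sketched as three filters. This is a reconstruction under assumptions, not the original script: the function names are hypothetical, and the input is assumed to be well-formed dotted quads, one per line, without leading zeros in any octet.

```bash
#!/usr/bin/env bash
# Filter: dotted quad on stdin -> one 32-bit integer per line.
ip_encode() {
    local a b c d
    while IFS=. read -r a b c d; do
        printf '%d\n' "$(( (a << 24) | (b << 16) | (c << 8) | d ))"
    done
}

# Filter: 32-bit integer on stdin -> dotted quad per line.
ip_decode() {
    local n
    while read -r n; do
        printf '%d.%d.%d.%d\n' \
            "$(( (n >> 24) & 255 ))" "$(( (n >> 16) & 255 ))" \
            "$(( (n >> 8) & 255 ))" "$(( n & 255 ))"
    done
}

# The whole job is a pipeline of filters around a plain numeric sort.
sort_ips() {
    ip_encode | sort -n | ip_decode
}

sorted=$(printf '%s\n' 10.0.0.2 9.255.0.1 10.0.0.10 | sort_ips)
echo "$sorted"
```

A plain lexical sort would put 10.0.0.10 before 10.0.0.2; encoding each address as a single number is what makes sort -n sufficient.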

This solution is somewhat unsatisfactory because if we are going to use sort, why not just use sort -V? Well, writing the sort ourselves is non-trivial. I wrote a function to compare two addresses and return -1, 0, or 1, similar to strcmp(), which effectively orders IP addresses. In any sane programming language, this is actually enough, and plugging this into a sort algorithm will produce a result. Bash has no mechanism to do that. I could write a merge sort implementation in bash, but remember that bash doesn’t have any real way to pass arrays between functions. What a pain in the ass.
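To make the pain concrete, here is a sketch of such a comparison function plugged into a hand-written insertion sort. Both ip_cmp and ip_sort are hypothetical reconstructions, not the original code; the result comes back on stdout and the input array is flattened into arguments, exactly the workarounds described earlier.

```bash
#!/usr/bin/env bash
# Print -1, 0, or 1 according to how address $1 orders against $2.
ip_cmp() {
    local -a x y
    local i
    IFS=. read -r -a x <<< "$1"
    IFS=. read -r -a y <<< "$2"
    for i in 0 1 2 3; do
        if (( ${x[i]} < ${y[i]} )); then echo -1; return; fi
        if (( ${x[i]} > ${y[i]} )); then echo 1; return; fi
    done
    echo 0
}

# Insertion sort: addresses arrive as flattened arguments,
# the sorted list leaves on stdout, one address per line.
ip_sort() {
    local -a out=()
    local ip i
    for ip in "$@"; do
        i=${#out[@]}
        # Walk left while the new address orders before its neighbour.
        while (( i > 0 )) && [[ $(ip_cmp "$ip" "${out[i-1]}") == -1 ]]; do
            i=$(( i - 1 ))
        done
        out=("${out[@]:0:i}" "$ip" "${out[@]:i}")
    done
    printf '%s\n' "${out[@]}"
}

sorted=$(ip_sort 10.0.0.2 9.255.0.1 10.0.0.10)
echo "$sorted"
```

Every comparison forks a subshell just to read back a -1, 0, or 1, which is a fair summary of why nobody writes sort algorithms in bash.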

Bash does make getting the data, and operating on it in a pipeline easy. It doesn’t provide many data structures or algorithms.

Meanwhile, Powershell will allow all of the same approaches as bash, and allow me to plug my comparison function into a sort algorithm. And if I want to define a sort algorithm, it will let me pass arrays between functions.

Python is a full-featured programming language, but it makes acquiring the data relatively difficult for a one-off: I have to open the file and read the data. I will have to parse the data and explicitly convert it to numbers and some kind of representation. Then I can use a built-in sort algorithm or define my own in a trivial way, because Python supports passing data structures between functions. The Python file will be longer and have more characters.

The bash solution is only acceptable because I was able to make my algorithm work in a pipelined data flow model.
Conclusion

Bash is severely limited as a scripting language but useful as a shell. Python is extremely useful as a scripting language, but impossible to use sanely as a shell. Powershell is an attempt to blend both, but it dumps Unix culture and C, and hands control of the shell to Microsoft. Part of why Powershell does so well in this comparison is that it was designed as a modern shell and scripting language combo. Asking the question this way inherently excludes Python. Perhaps a separate scripting language and shell are fine.
