Figure: Simple C source code for a "Hello world" program: main( ) { printf("hello, world") }. Taken from the seminal book The C Programming Language, it originates from Brian Kernighan at Bell Laboratories in 1974.

In computing, source code, or simply code or source, is human-readable plain text that can eventually result in controlling the behavior of a computer. To control a computer, it must be processed by a computer program: either executed directly via an interpreter or translated into a more computer-consumable form, for example by a compiler. Sometimes, code is compiled directly to machine code so that it can run in the native language of the computer without further processing. Many modern environments, though, compile to an intermediate representation such as bytecode, which can either be run by an interpreter or be compiled on demand to machine code via just-in-time compilation.
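As a minimal sketch of this pipeline, the following Python snippet treats a string as source code, translates it to an intermediate form, and then runs it. It assumes the CPython implementation, where the built-in compile() produces a code object holding bytecode; the variable names and the "<example>" file label are illustrative only.

```python
# Source code is plain text until a language implementation processes it.
source = "result = 6 * 7"

# Translate the text into a code object (CPython's intermediate form).
code_object = compile(source, "<example>", "exec")

# The bytecode itself is a sequence of bytes, not human-readable text.
bytecode = code_object.co_code

# Execute the compiled form in a fresh namespace and read back the result.
namespace = {}
exec(code_object, namespace)
print(namespace["result"])
```

Other implementations may skip the bytecode step or compile it further to machine code, so this illustrates only one of the routes described above.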

Background

The first programmable computers, which appeared at the end of the 1940s, were programmed in machine language (simple instructions that could be directly executed by the processor). Machine language was difficult to debug and was not portable between different computer systems. Initially, hardware resources were scarce and expensive, while human resources were cheaper. As programs grew more complex, programmer productivity became a bottleneck. This led to the introduction of high-level programming languages such as Fortran in the mid-1950s. These languages abstracted away the details of the hardware and were designed to express algorithms in a form that humans could understand more easily. Software, in the sense of instructions distinct from the underlying computer hardware, is therefore relatively recent, dating to early high-level programming languages such as Fortran, Lisp, and COBOL. High-level programming languages were invented together with the compilers needed to translate source code automatically into machine code that can be directly executed on the computer hardware.

Source code is the form of code that is modified directly by humans, typically in a high-level programming language. Object code can be directly executed by the machine and is generated automatically from the source code, often via an intermediate step, assembly language. While object code will only work on a specific platform, source code can be ported to a different machine and recompiled there. For the same source code, object code can vary significantly—not only based on the machine for which it is compiled, but also based on performance optimization from the compiler.
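The point that one source text can yield different generated code can be sketched by analogy in Python. This example uses CPython bytecode rather than native object code, and relies on the optimize parameter of the built-in compile(), which at level 1 drops assert statements. The source string and variable names are assumptions made for illustration.

```python
# One source text, compiled twice with different optimization levels.
source = (
    "value = 21\n"
    "assert value > 0, 'must be positive'\n"
    "result = value * 2\n"
)

# optimize=0 keeps assert statements; optimize=1 removes them.
plain = compile(source, "<example>", "exec", optimize=0)
optimized = compile(source, "<example>", "exec", optimize=1)

# The generated code differs in size even though the source is identical.
print(len(plain.co_code), len(optimized.co_code))

# Both compiled forms still compute the same result when executed.
plain_ns, optimized_ns = {}, {}
exec(plain, plain_ns)
exec(optimized, optimized_ns)
print(plain_ns["result"], optimized_ns["result"])
```

A native compiler would show the same effect more strongly, since its output also depends on the target machine.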

Organization

Most programs do not contain all the resources needed to run them and rely on external libraries. Part of the compiler's function is to link these files in such a way that the program can be executed by the hardware.

Figure: A more complex Java source code example. Written in object-oriented programming style, it demonstrates boilerplate code. Prologue comments are indicated in red, inline comments in green, and program statements in blue.

Software developers often use configuration management to track changes to source code files (version control). The configuration management system also keeps track of which object code file corresponds to which version of the source code file.

Purposes

Estimation

The number of source lines of code (SLOC) is often used as a metric when evaluating the productivity of computer programmers, the economic value of a code base, effort estimation for projects in development, and the ongoing cost of software maintenance after release.
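A rough SLOC count can be sketched in a few lines of Python. Definitions of SLOC vary between tools; this sketch assumes one simple convention (count every line that is neither blank nor a whole-line comment), and the function name count_sloc and its comment_prefix parameter are hypothetical.

```python
def count_sloc(source: str, comment_prefix: str = "#") -> int:
    """Count lines that are neither blank nor whole-line comments."""
    count = 0
    for line in source.splitlines():
        stripped = line.strip()
        # Skip empty lines and lines that consist only of a comment.
        if stripped and not stripped.startswith(comment_prefix):
            count += 1
    return count


sample = """# compute an area
width = 3

height = 4  # a trailing comment does not disqualify the line
area = width * height
"""

print(count_sloc(sample))  # prints 3
```

Real measurement tools typically also handle block comments, string literals, and statements that span several lines, which this sketch ignores.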

Communication

Source code is also used to communicate algorithms between parties, e.g., code snippets online or in books.

Computer programmers can find it helpful to review extant source code to learn about programming techniques.

Source code often contains comments—blocks of text marked for the compiler to ignore. This content is not part of the program logic, but is instead intended to help readers understand the program.
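That comments do not affect program logic can be checked with a small Python sketch, assuming the CPython implementation. The two source strings below differ only by an inline comment, and the names used are illustrative.

```python
# Two versions of the same statement: one with a comment, one without.
with_comment = "total = price * quantity  # price times units ordered"
without_comment = "total = price * quantity"

commented_code = compile(with_comment, "<example>", "exec")
bare_code = compile(without_comment, "<example>", "exec")

# The comment is discarded during translation, so the bytecode is identical.
same_bytecode = commented_code.co_code == bare_code.co_code
print(same_bytecode)

# Running either version with the same inputs yields the same value.
namespace = {"price": 5, "quantity": 4}
exec(commented_code, namespace)
print(namespace["total"])
```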

Companies often keep the source code confidential in order to hide algorithms considered a trade secret. Proprietary, secret source code and algorithms are widely used for sensitive government applications such as criminal justice, which results in black box behavior with a lack of transparency into the algorithm's methodology. The result is avoidance of public scrutiny of issues such as bias.

Modification

Access to the source code (not just the object code) is essential to modifying it. A programmer must understand how existing code works before changing it. The rate of understanding depends on both the code base and the skill of the programmer. Experienced programmers have an easier time understanding what the code does at a high level. Software visualization is sometimes used to speed up this process.

Many software programmers use an integrated development environment (IDE) to improve their productivity. IDEs typically have several features built in, including a source-code editor that can alert the programmer to common errors. Modification often includes code refactoring (improving structure without changing function) and restructuring (improving structure and function simultaneously). Nearly every change to code introduces new bugs or unexpected ripple effects, which require another round of fixes.

In 1974, the US Commission on New Technological Uses of Copyrighted Works (CONTU) decided that "computer programs, to the extent that they embody an author's original creation, are proper subject matter of copyright".

Proprietary software is rarely distributed as source code. Although the term open-source software literally refers to public access to the source code, open-source software has additional requirements: free redistribution, permission to modify the source code and release derivative works under the same license, and nondiscrimination between different uses—including commercial use. The free reusability of open-source software can speed up development.

See also

  • Bytecode
  • Code as data
  • Coding conventions
  • Free software
  • Legacy code
  • Machine code
  • Markup language
  • Obfuscated code
  • Object code
  • Open-source software
  • Package manager
  • Programming language
  • Source code repository
  • Syntax highlighting
  • Visual programming language
