Criteo Coding style for C#
==========================
Why a coding style?
-------------------
### Foreword
Long story short: consistent code, easier reading, less merge conflicts, happy us.
This coding style uses [RFC2119](http://www.ietf.org/rfc/rfc2119.txt) definitions for requirement levels.
### Emphasis
Given our recent move to Gerrit we'll review code more often and receive external contribution to our projects, meaning we'll have to read more and more C# code outside Visual Studio with no access to code analysis features or even syntax highlighting.
This perfectly points out what this coding style tries to evangelize: code readability must always be given priority over ease of writing. Some direct consequences are:
- [Native conventions](http://msdn.microsoft.com/en-us/library/ms184412.aspx) of the language must be preferred over any other one when they exist ;
- The more explicit you are the less room you leave for possibly wrong interpretation ;
- There should not be two ways to write the exact same thing.
All the principles from the [Zen of Python](http://www.python.org/dev/peps/pep-0020/) also happen to have concurring rationale. Cast a glance there and keep in mind they're behind almost all the rules stated in this document.
Blank characters
----------------
This section describes how and where blank characters may appear in source file ; no blank character shoud be used outside of the cases below.
### Whitespaces
Whitespaces are required on both side of binary and ternary operators with the exception of the comma and semicolon operators which require a whitespace on their right side only. Unary operators, braces, brackets and parenthesis don't need whitespaces:
padding = (width[1] - width[0]) * scale / 2f;
offset = margin * ++i;
A whitespace must be used after keywords which are followed by an opening parenthesis:
if (x > 5)
y = default (T);
Whitespaces are also used to satisfy alignment rules (see "declaration" section for details). No other unnecessary whitespace should appear anywhere outside from comments and trailing whitespaces must be removed.
Character ASCII 32 must be used to write whitespaces, other blank characters such as tablulations shall be avoided.
### Indents levels
Indent level defines how many whitespaces must appear before the first non-blank character of a line. Four whitespaces must be used per indent level, and initial indent level is zero.
Several constructs are followed by a block of code (which is a list of statements) which must be written with an indent level increased by one:
- Statement after some keywords (`if`, `for`, `while`, `do`, `using`, `lock`, etc.) ;
- Namespace and type declarations ;
- Body of a delegate, method, property getter/setter, event add/remove.
When a block of code is enclosed between braces, each of these braces must be on its own line and use it's parent indent level ; statements within the braces use an indent level increased by one:
class MyClass
{
bool MyProperty
{
get
{
return 3;
}
}
char MyMethod(int parameter)
{
if (parameter != 0)
{
for (int i = 0; i < 10; ++i)
Console.WriteLine("0" + parameter);
for (int i = 10; i < 99; ++i)
Console.WriteLine(parameter);
}
}
}
The `switch` keyword uses a specific indent pattern: an indent level is used for the `case` and `default` labels, and another level is required for code inside a label including the end-of-block keyword (`break`, `return`, `throw`, etc.) included. No braces must be used even when label bodies are multiline blocks of code:
switch (value)
{
case 1:
this.DoSomething();
break;
case 2:
Console.WriteLine("will use case 1");
goto case 1;
default:
throw new ArgumentOutOfRangeException("value");
}
Alignment rules sometimes specify that code must be aligned "to the next indent", meaning that text must be padded on the left with whitespaces so that next non-blank character starts at a column index multiple of four.
### Line breaks
Each statement should fit in one line and must be followed by a line break, meaning that a single line cannot contain more than one statement. The only exception to this rule is the short form of anonymous delegate declaration (see "delegates" section for details) which can be used on a single line if it contains no braces nor `return` keyword:
callback = (x) => x * 2;
Constructor calls, method calls and object initializers can be split across multiple lines by adding an extra line break before both opening and closing brace/parenthesis and using a new level of indent for arguments. You can add line breaks after any comma, but number of arguments per line must be consistent across the entire statement and arguments having the same position must be aligned to the next indent:
keywords = new Dictionary<string, Parser>
{
{"echo", (p) => p.ParseKeywordEcho()},
{"for", (p) => p.ParseKeywordFor()},
{"if", (p) => p.ParseKeywordIf()}
};
result = ComplexCall
(
x.InnerMethodCall(1, 2),
y.InnerMethodCall(3, 4)
);
pixels = new byte[]
{
0x54, 0xF8, 0xC9, 0xCD,
0x16, 0xD, 0x64, 0x05,
0x6D, 0xAB, 0x7, 0xCA
};
You can also split an expression after any binary operator, in that case you must align the right hand side part with the left hand side one:
while (reading && ((last >= 'A' && last <= 'Z') ||
(last >= 'a' && last <= 'z')))
last = this.Read();
Line breaks must use Windows end of line characters (ASCII 13 followed by ASCII 10).
### Empty lines
Empty lines must appear between two declaration statements if at least one of them spreads over multiple lines:
class Ship
{
private float x = 0;
private float y = 0;
public void Move(float speed)
{
this.x += Math.Cos(this.angle) * speed;
this.y += Math.Sin(this.angle) * speed;
}
public void Turn(float angle)
{
this.angle += angle;
}
}
An empty line is required before jumping statements `break`, `case`, `continue`, `goto`, `label`, `return` and `throw` unless they are the only statement in a code block:
if (states.TryGetValue(key, out state))
return true; // No empty line required
state = State.Unknown;
return false; // Empty line before jumping statement
All blocks of code (both single and multiline) must be followed by an empty line except for inner blocks closed immediately before their parent:
if (a)
{
if (b)
x(); // Empty line after this single line block
if (c)
{
// No empty line after this block, its parent is also closing
y();
z();
}
}
Rationale: empty lines around blocks and jumps highlights flow control instructions.
### Comments
Comments can be placed either on the line before a statement or on the same line after it, separated by at least one whitespace character. They doesn't count as actual lines of code and thus have no effect on alignment or other spacing rules which must be applied as if comment was not present:
if (exit) // Comment can appear at the end of a statement
break;
// Empty line above is mandatory regardless of this comment's presence
n = this.Next();
Both C (`//`) and C++ (`/* */`) comment styles can be used.
Symbols
-------
Elements of the C# language having a user-defined name are symbols and must comply with naming and casing rules according to their nature.
### Naming
Symbol names must use few non-ambiguous words and avoid unclear abbreviations. Names containing either more than 3 words or over-generic terms (common, helper, misc, utility, ...) should be avoided.
Variable names should help understanding their intended use and not what kind of data they contain, as this information can already be deduced from their type:
URL redirect; // OK: that's an URL which will be used for redirection
URL url; // Not OK: name doesn't provide any new information
Compound names must be formed by prepending adjectives or other descriptors to base name, meaning that `EventFailure` is a failure about some event, not an event indicating a failure.
Well-known acronyms can also be used in symbol names and must be written either entirely in lowercase or entirely in uppercase depending on whether the first letter is lower or uppercase:
public void HTTPRequest(URL target)
{
string dnsHostName;
bool readAsHTML;
// ...
}
Symbols used in a small scope (less than 10 lines, e.g. anonymous function construction) can use a single letter name. Symbol names musn't use underscores or any other special character as a prefix or suffix.
Rationale: special characters makes misreading easier, and native conventions exist for private or instance symbols or similar rules found in other coding styles.
### Namespaces
There should always be one namespace per file only. Namespace components must be names of the folders (from root to current) containing code file, converted to [PascalCase](http://c2.com/cgi/wiki?PascalCase):
// This file must be stored in folder /mycompany/myproject/
namespace MyCompany.MyProject
{
}
### Types
Only one public type (interface, class, struct, enum or delegate) must be defined per file, and its name must be the name of the file containing the code converted to [PascalCase](http://c2.com/cgi/wiki?PascalCase).
// This class must be found in a file named "myclass.cs"
public class MyClass
{
}
Some types names must comply with additional requirements:
- Interface names must start with a capital `I` and should be adjectives denoting an ability (e.g. `IEnumerable`) ;
- Event names must end with "Event" (e.g. `ServerStartEvent`).
When inheriting from a class or struct, child type's name must end with it's parent's name, e.g. `HTTPWebProxy` inherits from `WebProxy`, which inherits from `Proxy`.
### Imports
All `using` statements must appear before any namespace declaration in a file and must be sorted alphabetically, except for namespaces starting with `System` which must always appear first. Import statements with different top-level namespace must be separated with an empty line:
using System;
using System.Threading;
using Criteo.Arbitration;
using Criteo.Render;
Unused import statements must be removed from source files. Namespaces must not be aliased (construct `using X = Y` is not allowed).
Declarations
------------
Declaration statements are quite separate from others as they provide information about a program layout before actual execution can be described. This coding style intend to keep this distinction apparent.
### Type qualifiers
Qualifiers must always appear before type and be ordered according to their precedence. Qualifiers ordered by decreasing precedence are:
- Visibility: `public`, `protected` then `private`
- Membership: `static`, `abstract`, `virtual`, `override`, `new` then `sealed`
- Memory model: `const`, `readonly` then `volatile`
- Execution: `unsafe`
- Asynchronism: `async`
Qualifier precedence is also used to sort declarations: two declarations must be ordered according to the precedence of their first qualifier, or second if first are tied, etc.:
public readonly static DateTime Epoch; // Not OK, "static" has a higher precedence than "readonly" and should appear first
private override unsafe bool TryParse(string input, out T value); // OK
Type-specific rules define how multiple declarations having the same set of qualifiers must be ordered.
### Interfaces, classes and structs
Interfaces, classes and structs must use [PascalCase](http://c2.com/cgi/wiki?PascalCase) for their name and declare their members in following order:
- Constants
- Events
- Properties
- Fields
- Constructors
- Methods
- Types
Multiple members of the same type must be ordered by their qualifiers (see "qualifiers" section for details), then by the order defined by type-specific rules (see "methods and properties" section), then by alphabetical order.
Types must not be declared across multiple files (meaning the `partial` keyword is not allowed) unless all those files but one are generated by an automated tool and are not meant to be modified manually nor reviewed. In that case, source files used for code generation must be bundled with project.
Public sub-type should be avoided unless trivial (delegate or small pure entity).
### Constants
Constants must use capitalized [snake_case](http://en.wikipedia.org/wiki/Snake_case), e.g. `MAX_CLIENTS`. Constant must be used as replacements for `static readonly` declarations whenever possible (boolean values, integer numbers, floating point numbers, characters and strings).
### Fields and events
Private fields and events must use [camelCase](http://en.wikipedia.org/wiki/CamelCase), public ones must use [PascalCase](http://c2.com/cgi/wiki?PascalCase).
Private events name must always start with "on", public ones with "On", and the rest of the name should match the name of their event type less the `Event` word (e.g. `OnServerStart` for an event of type `ServerStartEvent`).
Fields and events may be assigned a default value if parent type doesn't have any explicit constructor. If a constructor is defined, all default values must be assigned there.
### Methods and properties
Methods and properties must use [PascalCase](http://c2.com/cgi/wiki?PascalCase) and be sorted by ascending alphabetical order in their parent type. Properties should use explicit attributes in concrete classes, and must use them if least one accessor is non public:
public string Name
{
get;
private set; // Not OK, explicit attribute must be used
}
Overridden methods having the same set of qualifiers must be ordered by decreasing number of parameters, or by alphabetical order of their parameter types (from first to last) if they have the same number of parameters:
public abstract void Browse(string target, bool followRedirect);
public abstract void Browse(string target); // Second position because one parameter only
public abstract void Browse(URL target); // Third position because "string" < "URL"
Overrides must be used as a replacement for parameters with default values, as later ones affect caller code and may cause more API breaks. If a method override accepts the same parameters as original method minus one parameter, it must be the last one of original method.
### Constructors
Constructors are methods, and as such they must follow the same rules than regular methods with one additional constraint ; constructor parameters used to directly initialize object fields must share the same name than those fields, and assignment statements must be sorted by field name using ascending alphabetical order:
class MyImage
{
private uint height;
private uint width;
public MyImage(uint width, uint height)
{
this.height = height;
this.width = width;
}
}
### Enumerations
Enum values should be alphabetically sorted [PascalCase](http://c2.com/cgi/wiki?PascalCase) names and must not repeat enum type name. No explicit mapping to any integer type should be specified unless required by external constraints such as serialization:
public enum RequestStatus
{
Cancelled,
Completed,
Failed,
Timeout
}
Rationale: enumerations types must be used to represent one possibility among a set (or several possibilities when `Flag` attribute is used). The integer type behind them is an implementation detail which should not interfere with design.
### Variables
Variable declaration should be kept visually separated from implementation by declaring local variables before any other statement in method and property bodies. Non-constant assignments must not be mixed with variable declarations. Only one variable can be declared per statement, lines should be ordered by variable name, names should use [camelCase](http://en.wikipedia.org/wiki/CamelCase) and be aligned to the next indent:
string name;
int rank;
float score;
name = this.PromptName(); // Implementation separated from declaration
rank = 0; // Could be assigned at declaration, but here for consistency reasons
Variable declaration mixed with implementation statements is allowed with `fixed`, `foreach` and `using` keywords, or when required by scope resolution rules (e.g. closure captures):
using (TextWriter backup = new StreamWriter(file))
{
foreach (IReader reader in readers) // OK, declaration allowed in a foreach statement
{
IExtractor closure = reader; // OK, required for closure to behave as expected
this.Invoke(() =>
{
lock (backup)
backup.Write(closure.Read());
});
}
}
Declarations with `var` keyword can only be used when type of assigned expression is clearly visible. Type can be forced to appear with help of the `as` keyword as long as you are not downcasting it (only upcasting is allowed):
var result1 = this.Compute(x, y); // Not OK, type is not visible
var result2 = this.GetName(x) + "_suffix"; // OK, it's a string
var result3 = this.Render(x) as Page; // OK if return type of method is a Page
### Generics
Generic parameters should be capitalized single letters, preferably starting with letter `T`. Capitalized single words can also be use as generic parameters to let them clearly appear in code.
Implementations
---------------
Non-declarative statements are too complex and a programming guide would be required to properly how they should be written. This coding style only mentions some general high-level rules.
### Code blocks
Code blocks containing only one statement must not use braces unless they are mandatory (e.g. method body):
for (int i = 0; i < 10; ++i) // Several statements, mandatory braces
{
if (random.NextDouble() < 0.5)
break; // One statement, no braces
}
### Operators
Order binary operators operands so that the one least likely to change (or the one which changes the less often) is on the left hand side of the operator:
i = x + this.offset; // Assuming "this.offset" changes less often that x
if (i < 5)
return;
Prefer the use of `++i` over `i++`, and the later one over `i += 1` when any of these constructs can be used without changing the result of an expression.
Result of assignments must not be used as value in an expression:
// Not OK, assignment used as a value
if ((i = stream.Read() == -1)
break;
// OK
i = stream.Read();
if (i == -1)
break;
Rationale: side effect in assignments are hard to read, risk of letting them unnoticed is too high for only one line of code saved.
### References
To avoid any confusion between variables, static members and instance members, references must let their belonging appear:
- Instance members and methods must be prefixed by `this`
- Static members and methods must be prefixed by their type
Required namespaces must be imported so types don't need to be prefixed by their namespace, except for ambiguous resolution. In that case, the shortest possible namespace path must be used to solve the ambiguity:
using System; // Imported to keep reference to "System.Drawing.Point" as short as possible (see below)
namespace MyApp
{
class Point
{
public static readonly Origin = new Point(0, 0);
private int x;
private int y;
// Prefix used to solve ambiguity between System.Drawing.Point and MyApp.Point
public Point(Drawing.Point point)
{
this.x = point.X;
this.y = point.Y;
}
public void Reset()
{
// Mandatory prefixes for "Set" instance method and "Origin" static member
this.Set(Point.Origin);
}
public void Set(Point other)
{
this.x = other.x; // Mandatory prefix for instance field
this.y = other.y; // Same here
}
}
}
Rationale: explicit references save reviewer from having to infer symbol resolution.
### Delegates
Short form must always be used when creating delegate (anonymous functions), and arguments must always be enclosed in a pair of parenthesis even when there is only one argument. Argument types may be omitted when possible:
// Mandatory parenthesis, no need for argument types
callback = (i) => i * 2;
// Argument types required because of "out" modifier
parser = (string input, out float number) => float.TryParse(input, NumberStyles.Float, CultureInfo.InvariantCulture, out number);
Remember that one-liner delegates are a special case (see "line breaks" section), multi-line delegates must follow usual code block indent rules:
read = (name) =>
{
Field field;
if (this.fields.TryGetValue(name, out field))
return field;
return Field.Empty;
};
Guidelines
----------
Programming guidelines is a topic on its own which is beyond the scope of a coding style, but some important rules are summarized here for the sake of completeness.
### Casts
Use of type casts should be avoided whenever possible. One legitimate use of casts is to work around issues with untyped APIs (e.g. legacy .NET 1.0 API).
Target type should always be known so you don't have to test whether an instance is of one type or another, meaning keyword `is` should never be used. For the same reason, prefer `(Type)value` form over `value as Type`.
Rationale: casts can be avoided by applying good object-oriented practices and using generics, thus often denote a poor design. Casting errors are bugs and having the CLR throw an exception is more safe than keeping the error silent.
### Exceptions
Exceptions should be used sparingly as they are very easy to abuse and have the addtionnal flaw of being very slow in C#. Predictable runtime failures, invalid user inputs or any other outside event must not throw any exception. Exceptions are acceptable if internal components of the application are in a state that completely prevents it from running. Proper error handling (e.g. error status return, error events, logs, etc.) must be applied in all other cases.
When using APIs designed to throw exceptions in non-exceptional situations, they must either be intercepted and replaced by explicit error propagation or not be caught at all.