OCaml

Paradigm	Multi-paradigm: functional, imperative, modular, object-oriented
Family	ML
Designed by	Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy, Ascánder Suárez
Developer	INRIA
First appeared	1996
Stable release	4.12.0 / 24 February 2021
Typing discipline	Inferred, static, strong, structural
Implementation language	OCaml, C
Platform	IA-32, x86-64, Power, SPARC, ARM 32-64
OS	Cross-platform: Unix, macOS, Windows
License	LGPLv2.1
Filename extensions	.ml, .mli
Website	ocaml.org
Influenced by
	C, Caml, Modula-3, Pascal, Standard ML
Influenced
	ATS, Coq, Elm, F#, F*, Haxe, Opa, Rust, Scala

OCaml (/oʊˈkæməl/ oh-KAM-əl, formerly Objective Caml) is a general-purpose, multi-paradigm programming language which extends the Caml dialect of ML with object-oriented features. OCaml was created in 1996 by Xavier Leroy, Jérôme Vouillon, Damien Doligez, Didier Rémy, Ascánder Suárez, and others.

The OCaml toolchain includes an interactive top-level interpreter, a bytecode compiler, an optimizing native code compiler, a reversible debugger, and a package manager (OPAM). OCaml was initially developed in the context of automated theorem proving, and has an outsize presence in static analysis and formal methods software. Beyond these areas, it has found serious use in systems programming, web development, and financial engineering, among other application domains.

The acronym CAML originally stood for Categorical Abstract Machine Language, but OCaml omits this abstract machine.^[2] OCaml is a free and open-source software project managed and principally maintained by the French Institute for Research in Computer Science and Automation (INRIA). In the early 2000s, elements from OCaml were adopted by many languages, notably F# and Scala.

Philosophy

ML-derived languages are best known for their static type systems and type-inferring compilers. OCaml unifies functional, imperative, and object-oriented programming under an ML-like type system. Thus, programmers need not be highly familiar with the pure functional language paradigm to use OCaml.

By requiring the programmer to work within the constraints of its static type system, OCaml eliminates many of the type-related runtime problems associated with dynamically typed languages. Also, OCaml's type-inferring compiler greatly reduces the need for the manual type annotations that are required in most statically typed languages. For example, the data types of variables and the signatures of functions usually need not be declared explicitly, as they must be in languages like Java and C#, because they can be inferred from the operators and other functions that are applied to the variables and other values in the code. Effective use of OCaml's type system can require some sophistication on the part of a programmer, but this discipline is rewarded with reliable, high-performance software.
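
As a small illustration of the inference described above, the definitions below carry no type annotations; the types shown in the comments are the ones the compiler infers from the operators used. The function names are invented for this sketch.

```ocaml
(* No annotations are written; the comments show the inferred types. *)

(* The float operators +. and /. force both arguments to be floats:
   average : float -> float -> float *)
let average a b = (a +. b) /. 2.0

(* Nothing constrains x or y, so the function is polymorphic:
   pair_up : 'a -> 'b -> 'a * 'b *)
let pair_up x y = (x, y)

(* String concatenation ^ forces a string argument and result:
   shout : string -> string *)
let shout s = String.uppercase_ascii s ^ "!"
```

Entering these definitions at the top level prints the inferred signatures, which is a convenient way to check them.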

OCaml is perhaps most distinguished from other languages with origins in academia by its emphasis on performance. Its static type system prevents runtime type mismatches and thus obviates runtime type and safety checks that burden the performance of dynamically typed languages, while still guaranteeing runtime safety, except when array bounds checking is turned off or when some type-unsafe features like serialization are used. These are rare enough that avoiding them is quite possible in practice.
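
The runtime safety mentioned above can be observed with array accesses. The sketch below assumes the default compiler settings, in which bounds checking is enabled; safe_get is a hypothetical helper name.

```ocaml
(* With bounds checking enabled (the default), an out-of-range index
   raises Invalid_argument instead of reading arbitrary memory. *)
let safe_get arr i =
  try Some arr.(i)
  with Invalid_argument _ -> None

let in_range = safe_get [| 10; 20; 30 |] 1     (* index 1 exists *)
let out_of_range = safe_get [| 10; 20; 30 |] 5 (* index 5 does not *)
```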

Aside from type-checking overhead, functional programming languages are, in general, challenging to compile to efficient machine language code, due to issues such as the funarg problem. Along with standard loop, register, and instruction optimizations, OCaml's optimizing compiler employs static program analysis methods to optimize value boxing and closure allocation, helping to maximize the performance of the resulting code even if it makes extensive use of functional programming constructs.

Xavier Leroy has stated that "OCaml delivers at least 50% of the performance of a decent C compiler",^[3] although a direct comparison is impossible. Some functions in the OCaml standard library are implemented with faster algorithms than equivalent functions in the standard libraries of other languages. For example, the implementation of set union in the OCaml standard library in theory is asymptotically faster than the equivalent function in the standard libraries of imperative languages (e.g., C++, Java) because the OCaml implementation exploits the immutability of sets to reuse parts of input sets in the output (see persistent data structure).
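
The observable consequence of this immutability can be sketched with the standard Set functor: union returns a new set and leaves both inputs valid and unchanged. The sketch does not show the internal sharing itself, only the persistent behaviour; the module name IntSet is chosen for illustration.

```ocaml
(* An integer set built with the standard library's Set.Make functor. *)
module IntSet = Set.Make (struct
  type t = int
  let compare = compare
end)

let a = IntSet.of_list [1; 2; 3]
let b = IntSet.of_list [3; 4; 5]

(* union builds a new set; a and b are not modified. *)
let u = IntSet.union a b

let () =
  Printf.printf "a has %d elements, u has %d elements\n"
    (IntSet.cardinal a) (IntSet.cardinal u)
```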

Features

OCaml features a static type system, type inference, parametric polymorphism, tail recursion, pattern matching, first class lexical closures, functors (parametric modules), exception handling, and incremental generational automatic garbage collection.
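
Two of these features, functors and pattern matching, can be sketched together. The names ORDERED, MakeMax, IntMax and StringMax are hypothetical and exist only for this example; the functor accepts any module that provides a type t and a compare function.

```ocaml
(* The interface a module must satisfy to be passed to the functor. *)
module type ORDERED = sig
  type t
  val compare : t -> t -> int
end

(* A functor: a module parameterised by another module. *)
module MakeMax (O : ORDERED) = struct
  (* Largest element of a list, or None for the empty list. *)
  let max_of_list = function
    | [] -> None
    | x :: xs ->
      Some (List.fold_left
              (fun acc y -> if O.compare y acc > 0 then y else acc)
              x xs)
end

(* Two instances of the functor, for ints and for strings. *)
module IntMax = MakeMax (struct type t = int let compare = compare end)
module StringMax = MakeMax (String)

let biggest_int = IntMax.max_of_list [3; 7; 2]
let biggest_string = StringMax.max_of_list ["pear"; "apple"]
```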

OCaml is notable for extending ML-style type inference to an object system in a general-purpose language. This permits structural subtyping, where object types are compatible if their method signatures are compatible, regardless of their declared inheritance (an unusual feature in statically typed languages).
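
A minimal sketch of structural typing for objects: greet accepts any object that has a name method returning a string, and the two objects below share no class or declared inheritance. All names are illustrative.

```ocaml
(* Inferred type: < name : string; .. > -> string
   The ".." means "any other methods are allowed". *)
let greet obj = "Hello, " ^ obj#name

(* Two unrelated immediate objects that both provide name. *)
let person = object
  method name = "Ada"
  method age = 36
end

let robot = object
  method name = "R2"
  method beep = "beep"
end

let greetings = [greet person; greet robot]
```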

A foreign function interface for linking to C primitives is provided, including language support for efficient numerical arrays in formats compatible with both C and Fortran. OCaml also supports creating libraries of OCaml functions that can be linked to a main program in C, so that an OCaml library can be distributed to C programmers who have no knowledge or installation of OCaml.
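
The numerical arrays mentioned above are provided by the Bigarray module. The sketch below assumes OCaml 4.07 or later, where Bigarray is part of the standard library, and creates the same array in C layout and in Fortran layout; the main visible difference on the OCaml side is the index base.

```ocaml
(* A one-dimensional array of 64-bit floats in C layout
   (indices start at 0). *)
let c_arr = Bigarray.Array1.create Bigarray.float64 Bigarray.c_layout 3

(* The same kind of array in Fortran layout (indices start at 1). *)
let f_arr =
  Bigarray.Array1.create Bigarray.float64 Bigarray.fortran_layout 3

let () =
  for i = 0 to 2 do
    c_arr.{i} <- float_of_int i;      (* valid indices: 0, 1, 2 *)
    f_arr.{i + 1} <- float_of_int i   (* valid indices: 1, 2, 3 *)
  done
```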

The OCaml distribution contains:

Lexical analysis and parsing tools called ocamllex and ocamlyacc
Debugger that supports stepping backwards to investigate errors
Documentation generator
Profiler – to measure performance
Many general-purpose libraries

The native code compiler is available for many platforms, including Unix, Microsoft Windows, and Apple macOS. Portability is achieved through native code generation support for major architectures: IA-32, x86-64 (AMD64), Power, SPARC, ARM, and ARM64.^[4]

OCaml bytecode and native code programs can be written in a multithreaded style, with preemptive context switching. However, because the garbage collector of the INRIA OCaml system (which is the only currently available full implementation of the language) is not designed for concurrency, symmetric multiprocessing is unsupported.^[5] OCaml threads in the same process execute by time sharing only. There are, however, several libraries for distributed computing, such as Functory and ocamlnet/Plasma.

Development environment

Since 2011, many new tools and libraries have been contributed to the OCaml development environment:

Development tools:
- opam is a package manager for OCaml, developed by OCamlPro.
- Merlin provides IDE-like functionality for multiple editors, including type throwback, go-to-definition, and auto-completion.
- Dune is a composable build-system for OCaml.
- OCamlformat is an auto-formatter for OCaml.
Web sites:
- OCaml.org is the primary site for the language.
- discuss.ocaml.org is an instance of Discourse that serves as the primary discussion site for OCaml.
Alternate compilers for OCaml:
- js_of_ocaml, developed by the Ocsigen team, is an optimizing compiler from OCaml to JavaScript.
- BuckleScript, which also targets JavaScript, with a focus on producing readable, idiomatic JavaScript output.
- ocamlcc is a compiler from OCaml to C, to complement the native code compiler for unsupported platforms.
- OCamlJava, developed by INRIA, is a compiler from OCaml to the Java virtual machine (JVM).
- OCaPic, developed by Lip6, is an OCaml compiler for PIC microcontrollers.

Code examples

Snippets of OCaml code are most easily studied by entering them into the top-level REPL. This is an interactive OCaml session that prints the inferred types of resulting or defined expressions.^[6] The OCaml top-level is started by simply executing the ocaml program:

$ ocaml
     Objective Caml version 3.09.0
#

Code can then be entered at the "#" prompt. For example, to calculate 1+2*3:

# 1 + 2 * 3;;
- : int = 7

OCaml infers the type of the expression to be "int" (a machine-precision integer) and gives the result "7".

Hello World

The following program "hello.ml":

print_endline "Hello World!"

can be compiled into a bytecode executable:

$ ocamlc hello.ml -o hello

or compiled into an optimized native-code executable:

$ ocamlopt hello.ml -o hello

and executed:

$ ./hello
Hello World!
$

The first argument to ocamlc, "hello.ml", specifies the source file to compile and the "-o hello" flag specifies the output file.^[7]

Summing a list of integers

Lists are one of the fundamental datatypes in OCaml. The following code example defines a recursive function sum that accepts one argument, integers, which is expected to be a list of integers. Note the keyword rec, which denotes that the function is recursive. The function recursively iterates over the given list of integers and returns the sum of the elements. The match expression has similarities to C's switch statement, though it is far more general.

let rec sum integers =                   (* Keyword rec means 'recursive'. *)
  match integers with
  | [] -> 0                              (* Yield 0 if integers is the empty 
                                            list []. *)
  | first :: rest -> first + sum rest;;  (* Recursive call if integers is a non-
                                            empty list; first is the first 
                                            element of the list, and rest is a 
                                            list of the rest of the elements, 
                                            possibly []. *)

  # sum [1;2;3;4;5];;
  - : int = 15

Another way is to use the standard fold function that works with lists.

let sum integers =
  List.fold_left (fun accumulator x -> accumulator + x) 0 integers;;

  # sum [1;2;3;4;5];;
  - : int = 15

Since the anonymous function is simply the application of the + operator, this can be shortened to:

let sum integers =
  List.fold_left (+) 0 integers

Furthermore, one can omit the list argument by making use of a partial application:

let sum =
  List.fold_left (+) 0
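
The same partial-application pattern applies to other folds. In this sketch, product and longest are illustrative names; note that the multiplication operator must be written ( * ) with spaces, because (* would start a comment.

```ocaml
(* Each definition leaves out the list argument, as sum does. *)
let sum = List.fold_left ( + ) 0

(* Spaces are required around * so that it is not read as a comment. *)
let product = List.fold_left ( * ) 1

(* Length of the longest string in a list, or 0 for the empty list. *)
let longest = List.fold_left (fun acc s -> max acc (String.length s)) 0
```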

Quicksort

OCaml lends itself to concisely expressing recursive algorithms. The following code example implements an algorithm similar to quicksort that sorts a list in increasing order.

 let rec qsort = function
   | [] -> []
   | pivot :: rest ->
     let is_less x = x < pivot in
     let left, right = List.partition is_less rest in
     qsort left @ [pivot] @ qsort right
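
A short usage sketch, repeating the definition so that the block is self-contained. Elements equal to the pivot fail the is_less test and go to the right partition, so duplicates are kept. Because < is polymorphic, the same function sorts lists of strings as well.

```ocaml
let rec qsort = function
  | [] -> []
  | pivot :: rest ->
    (* Elements strictly smaller than the pivot go left, the rest right. *)
    let is_less x = x < pivot in
    let left, right = List.partition is_less rest in
    qsort left @ [pivot] @ qsort right

let sorted_ints = qsort [3; 1; 4; 1; 5; 9; 2; 6]
let sorted_words = qsort ["pear"; "apple"; "fig"]
```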

Birthday problem

The following program calculates the smallest number of people in a room for whom the probability that all birthdays are distinct is less than 50% (the birthday problem: for 1 person the probability is 365/365, or 100%; for 2 it is 364/365; for 3 it is 364/365 × 363/365; and so on). The answer is 23.

let year_size = 365.

let rec birthday_paradox prob people =
  let prob = (year_size -. float people) /. year_size *. prob  in
  if prob < 0.5 then
    Printf.printf "answer = %d\n" (people+1)
  else
    birthday_paradox prob (people+1)
;;

birthday_paradox 1.0 1
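
A variant of the same computation that returns the answer instead of printing it, which makes the result easier to reuse and check. The name smallest_group is hypothetical; the arithmetic is unchanged from the program above.

```ocaml
let year_size = 365.

(* On entry, prob is the probability that the first `people` people
   all have distinct birthdays. *)
let rec smallest_group prob people =
  (* Probability that one more person also has a distinct birthday. *)
  let prob = (year_size -. float_of_int people) /. year_size *. prob in
  if prob < 0.5 then people + 1
  else smallest_group prob (people + 1)

let answer = smallest_group 1.0 1
```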

Church numerals

The following code defines a Church encoding of natural numbers, with successor (succ) and addition (add). A Church numeral n is a higher-order function that accepts a function f and a value x and applies f to x exactly n times. To convert a Church numeral from a functional value to a string, we pass it a function that prepends the string "S" to its input and the constant string "0".

let zero f x = x
let succ n f x = f (n f x)
let one = succ zero
let two = succ (succ zero)
let add n1 n2 f x = n1 f (n2 f x)
let to_string n = n (fun k -> "S" ^ k) "0"
let _ = to_string (add (succ two) two)
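
In the same spirit, a Church numeral can be converted to a native integer by passing it an increment function and 0. The sketch below repeats the basic definitions and adds two hypothetical helpers, mul and to_int; two and three are written with explicit f and x parameters so that they remain fully polymorphic.

```ocaml
let zero _f x = x
let succ n f x = f (n f x)
let add n1 n2 f x = n1 f (n2 f x)

(* Multiplication: repeat "n2 applications of f" n1 times. *)
let mul n1 n2 f x = n1 (n2 f) x

(* Count the applications of f, starting from 0. *)
let to_int n = n (fun k -> k + 1) 0

(* Written with explicit parameters to keep them polymorphic. *)
let two f x = succ (succ zero) f x
let three f x = succ two f x

let five = to_int (add two three)
let six = to_int (mul two three)
```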

Arbitrary-precision factorial function (libraries)

A variety of libraries are directly accessible from OCaml. For example, OCaml has a built-in library for arbitrary-precision arithmetic. As the factorial function grows very rapidly, it quickly overflows machine-precision numbers (typically 32 or 64 bits). Thus, factorial is a suitable candidate for arbitrary-precision arithmetic.
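
The overflow can be seen with fixed-width integers. This sketch uses the standard Int64 module, whose arithmetic wraps around silently; fact64 is an illustrative name. 20! still fits in a signed 64-bit integer, while 21! does not and comes out negative.

```ocaml
(* Factorial on signed 64-bit integers; Int64.mul wraps on overflow. *)
let rec fact64 n =
  if Int64.equal n 0L then 1L
  else Int64.mul n (fact64 (Int64.pred n))

let () =
  (* 20! fits in 64 bits; 21! exceeds the range and wraps around. *)
  Printf.printf "20! = %Ld\n21! = %Ld (wrapped)\n"
    (fact64 20L) (fact64 21L)
```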

In OCaml, the Num module (now superseded by the ZArith module) provides arbitrary-precision arithmetic and can be loaded into a running top-level using:

# #use "topfind";;
# #require "num";;
# open Num;;

The factorial function may then be written using the arbitrary-precision numeric operators =/, */ and -/ :

# let rec fact n =
    if n =/ Int 0 then Int 1 else n */ fact(n -/ Int 1);;
val fact : Num.num -> Num.num = <fun>

This function can compute much larger factorials, such as 120!:

# string_of_num (fact (Int 120));;
- : string =
"6689502913449127057588118054090372586752746333138029810295671352301633
55724496298936687416527198498130815763789321409055253440858940812185989
8481114389650005964960521256960000000000000000000000000000"

Triangle (graphics)

The following program renders a rotating triangle in 2D using OpenGL:

let () =
  ignore (Glut.init Sys.argv);
  Glut.initDisplayMode ~double_buffer:true ();
  ignore (Glut.createWindow ~title:"OpenGL Demo");
  let angle t = 10. *. t *. t in
  let render () =
    GlClear.clear [ `color ];
    GlMat.load_identity ();
    GlMat.rotate ~angle: (angle (Sys.time ())) ~z:1. ();
    GlDraw.begins `triangles;
    List.iter GlDraw.vertex2 [-1., -1.; 0., 1.; 1., -1.];
    GlDraw.ends ();
    Glut.swapBuffers () in
  GlMat.mode `modelview;
  Glut.displayFunc ~cb:render;
  Glut.idleFunc ~cb:(Some Glut.postRedisplay);
  Glut.mainLoop ()

The LablGL bindings to OpenGL are required. The program may then be compiled to bytecode with:

  $ ocamlc -I +lablGL lablglut.cma lablgl.cma simple.ml -o simple

or to nativecode with:

  $ ocamlopt -I +lablGL lablglut.cmxa lablgl.cmxa simple.ml -o simple

or, more simply, using the ocamlfind build command

  $ ocamlfind opt simple.ml -package lablgl.glut -linkpkg -o simple

and run:

  $ ./simple

Far more sophisticated, high-performance 2D and 3D graphical programs can be developed in OCaml. Thanks to the use of OpenGL and OCaml, the resulting programs can be cross-platform, compiling without any changes on many major platforms.

Fibonacci sequence

The following code calculates the n-th Fibonacci number for a given input n. It uses tail recursion and pattern matching.

let fib n =
  let rec fib_aux m a b =
    match m with
    | 0 -> a
    | _ -> fib_aux (m - 1) b (a + b)
  in fib_aux n 0 1
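
A usage sketch, repeating the definition so that it runs on its own. It assumes n is non-negative, since the recursion only stops when the counter reaches 0. List.init requires OCaml 4.06 or later.

```ocaml
let fib n =
  (* a and b are consecutive Fibonacci numbers; m counts down to 0. *)
  let rec fib_aux m a b =
    match m with
    | 0 -> a
    | _ -> fib_aux (m - 1) b (a + b)
  in
  fib_aux n 0 1

(* The first ten Fibonacci numbers, fib 0 through fib 9. *)
let first_ten = List.init 10 fib
```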

Higher-order functions

Functions may take functions as input and return functions as results. For example, applying twice to a function f yields a function that applies f two times to its argument.

let twice (f : 'a -> 'a) = fun (x : 'a) -> f (f x);;
let inc (x : int) : int = x + 1;;
let add2 = twice inc;;
let inc_str (x : string) : string = x ^ " " ^ x;;
let add_str = twice(inc_str);;

  # add2 98;;
  - : int = 100
  # add_str "Test";;
  - : string = "Test Test Test Test"

The function twice uses a type variable 'a to indicate that it can be applied to any function f mapping from a type 'a to itself, rather than only to int->int functions. In particular, twice can even be applied to itself.

  # let fourtimes f = (twice twice) f;;
  val fourtimes : ('a -> 'a) -> 'a -> 'a = <fun>
  # let add4 = fourtimes inc;;
  val add4 : int -> int = <fun>
  # add4 98;;
  - : int = 102
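
The idea generalizes from two applications to any number. The sketch below defines a hypothetical apply_n that applies f to its argument n times and uses it to rebuild add4.

```ocaml
(* apply_n n f x applies f to x exactly n times; n <= 0 returns x. *)
let rec apply_n n f x =
  if n <= 0 then x else apply_n (n - 1) f (f x)

let inc x = x + 1

(* Partial application: add4 is a function of type int -> int. *)
let add4 = apply_n 4 inc
```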

Derived languages

MetaOCaml

MetaOCaml^[8] is a multi-stage programming extension of OCaml enabling incremental compiling of new machine code during runtime. Under some circumstances, significant speedups are possible using multistage programming, because more detailed information about the data to process is available at runtime than at the regular compile time, so the incremental compiler can optimize away many cases of condition checking, etc.

As an example: if at compile time it is known that some power function x -> x^n is needed often, but the value of n is known only at runtime, a two-stage power function can be used in MetaOCaml:

 let rec power n x =
   if n = 0
   then .<1>.
   else
     if even n
     then sqr (power (n/2) x)
     else .<.~x * .~(power (n - 1) x)>.

As soon as n is known at runtime, a specialized and very fast power function can be created:

 .<fun x -> .~(power 5 .<x>.)>.

The result is:

 fun x_1 -> (x_1 *
     let y_3 = 
         let y_2 = (x_1 * 1)
         in (y_2 * y_2)
     in (y_3 * y_3))

The new function is automatically compiled.
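
Standard OCaml has no staging brackets, but the idea of specializing on n before x is known can be approximated with closures. The following is a sketch in plain OCaml, not MetaOCaml: no code is generated or compiled at runtime, but the recursion over n is carried out once, when power 5 is evaluated, and the resulting function only performs multiplications.

```ocaml
(* Not MetaOCaml: an approximation of staging using closures.
   The recursion over n runs once, before any x is supplied. *)
let rec power n : int -> int =
  if n = 0 then fun _ -> 1
  else if n mod 2 = 0 then
    let half = power (n / 2) in
    fun x -> let y = half x in y * y
  else
    let rest = power (n - 1) in
    fun x -> x * rest x

(* Specialised once for n = 5, then reusable for many values of x. *)
let power5 = power 5
```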

Other derived languages

AtomCaml provides a synchronization primitive for atomic (transactional) execution of code.
Emily (2006) is a subset of OCaml 3.08 that uses a design rule verifier to enforce object-capability model security principles.
F# is a .NET Framework language based on OCaml.
Fresh OCaml facilitates manipulating names and binders.
GCaml adds extensional polymorphism to OCaml, thus allowing overloading and type-safe marshalling.
JoCaml integrates constructions for developing concurrent and distributed programs.
OCamlDuce extends OCaml with features such as XML expressions and regular-expression types.
OCamlP3l is a parallel programming system based on OCaml and the P3L language.
While not truly a separate language, Reason is an alternative syntax and toolchain for OCaml created at Facebook.

Software written in OCaml

0install, a multi-platform package manager.
Coccinelle, a utility for transforming the source code of C programs.
Coq, a formal proof management system.
FFTW, a library for computing discrete Fourier transforms. Several C routines have been generated by an OCaml program named genfft.
The web version of Facebook Messenger.^[9]
Flow, a static analyzer created at Facebook that infers and checks static types for JavaScript.^[10]
Owl Scientific Computing, a dedicated system for scientific and engineering computing.
Frama-C, a framework for analyzing C programs.
GeneWeb, free and open-source multi-platform genealogy software.
The Hack programming language compiler, created at Facebook, extending PHP with static types.
The Haxe programming language compiler.
HOL Light, a formal proof assistant.
Infer, a static analyzer created at Facebook for Java, C, C++, and Objective-C, used to detect bugs in iOS and Android apps.^[11]
Lexifi Apropos, a system for modeling complex derivatives.
MirageOS, a unikernel programming framework written in pure OCaml.
MLdonkey, a peer-to-peer file sharing application based on the EDonkey network.
Ocsigen, an OCaml web framework.
Opa, a free and open-source programming language for web development.
pyre-check, a type checker for Python created at Facebook.^[12]
Tezos, a self-amending smart contract platform using XTZ as a native currency.
Unison, a file synchronization program to synchronize files between two directories.
The reference interpreter for WebAssembly, a low-level bytecode intended for execution inside web browsers.^[13]
Xen Cloud Platform (XCP), a turnkey virtualization solution for the Xen hypervisor.

Users

Several dozen companies use OCaml to some degree.^[14] Notable examples include:

Bloomberg L.P., which created BuckleScript, an OCaml compiler backend targeting JavaScript.^[15]
Citrix Systems, which uses OCaml in XenServer (rebranded as Citrix Hypervisor during 2018).
Facebook, which developed Flow, Hack, Infer, Pfff, and Reason in OCaml.
Jane Street Capital, a proprietary trading firm, which adopted OCaml as its preferred language in its early days.^[16]
MEDIT, France, which uses OCaml for bioinformatics.^[14]

References

[1] "Modules". Retrieved 22 February 2020.
[2] "A History of OCaml". Retrieved 24 December 2016.
[3] Linux Weekly News.
[4] "ocaml/asmcomp at trunk · ocaml/ocaml · GitHub". GitHub. Retrieved 2 May 2015.
[5] "Archives of the Caml mailing list > Message from Xavier Leroy". Retrieved 2 May 2015.
[6] "OCaml - The toplevel system or REPL (ocaml)". ocaml.org. Retrieved 17 May 2021.
[7] https://caml.inria.fr/pub/docs/manual-ocaml/comp.html
[8] oleg-at-okmij.org. "BER MetaOCaml". okmij.org.
[9] "Messenger.com Now 50% Converted to Reason · Reason". reasonml.github.io. Retrieved 27 February 2018.
[10] "Flow: A Static Type Checker for JavaScript". Flow.
[11] "Infer static analyzer". Infer.
[12] "GitHub - facebook/pyre-check: Performant type-checking for python". 9 February 2019 – via GitHub.
[13] "WebAssembly/spec: WebAssembly specification, reference interpreter, and test suite". World Wide Web Consortium. 5 December 2019. Retrieved 14 May 2021 – via GitHub.
[14] "Companies using OCaml". OCaml.org. Retrieved 14 May 2021.
[15] "BuckleScript: The 1.0 release has arrived! | Tech at Bloomberg". Tech at Bloomberg. 8 September 2016. Retrieved 21 May 2017.
[16] Yaron Minsky (1 November 2011). "OCaml for the Masses". Retrieved 2 May 2015.
