Reading file in OCaml
Reading a file in OCaml can be done using the In_channel module.
Consider an input file input.txt
helloearth
hellomars
hellojupiter
Short explanation
Reading file in full
let filename = "input.txt" in
let contents = In_channel.with_open_text filename In_channel.input_all in
String.split_on_char '\n' contents
Reading file line by line
let () =
  let filename = "input.txt" in
  In_channel.with_open_text filename (fun in_channel ->
    let rec loop () =
      match In_channel.input_line in_channel with
      | Some line ->
        print_endline line;
        loop ()
      | None -> print_endline "~~ End of file reached"
    in
    loop ())
The full-read snippet evaluates to a list of strings, one per line; the line-by-line snippet prints each line as it is read.
In_channel
In_channel is a module that provides functions for working with input channels. A channel can be a file or standard input. The module also provides convenience functions for reading a file. For example, to read a text file we can use
In_channel.with_open_text "input.txt"
The function takes care of opening and closing the file as well.
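As a minimal sketch of the points above, the snippet below writes a small sample file and then reads it through a helper that accepts any input channel. The helper name `char_count` and the temporary file are assumptions made for illustration; it also assumes OCaml 4.14 or later, where `In_channel` and `Out_channel` are part of the standard library.

```ocaml
(* Hypothetical helper: works on any input channel, file or stdin. *)
let char_count (ic : In_channel.t) : int =
  String.length (In_channel.input_all ic)

let () =
  (* Create a throwaway sample file so the example is self-contained *)
  let path = Filename.temp_file "sample" ".txt" in
  Out_channel.with_open_text path (fun oc ->
      Out_channel.output_string oc "helloearth\n");
  (* The file is opened before the callback runs and closed after it returns *)
  let n = In_channel.with_open_text path char_count in
  Printf.printf "%d characters\n" n;
  Sys.remove path
  (* For standard input, the same helper could be called as
     char_count In_channel.stdin *)
```

Because the helper only needs an `In_channel.t`, it does not care where the channel came from.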
Longer explanation
We can read a file as binary or as text. There are functions available for both:
- In_channel.with_open_bin
- In_channel.with_open_text
We will focus on reading as text for now.
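For completeness, here is a rough sketch of the binary variant. The helper name `read_raw` and the sample bytes are made up for illustration. On Unix-like systems the two modes behave the same; on Windows, text mode may translate line endings, which is why binary mode is the safer choice for non-text data.

```ocaml
(* Hypothetical helper: read a whole file as raw bytes. *)
let read_raw path : string =
  In_channel.with_open_bin path In_channel.input_all

let () =
  let path = Filename.temp_file "sample" ".bin" in
  (* Write three bytes, including a NUL byte, in binary mode *)
  Out_channel.with_open_bin path (fun oc ->
      Out_channel.output_string oc "a\000b");
  let data = read_raw path in
  Printf.printf "read %d bytes\n" (String.length data);
  Sys.remove path
```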
We can read a file in two ways:
- Full into memory
- Line by line (to process on the fly)
Reading full file
let () =
  let filename = "input.txt" in
  let contents = In_channel.with_open_text filename In_channel.input_all in
  let lst = String.split_on_char '\n' contents in
  List.iter (fun x -> print_endline x) lst
Here we are using In_channel.with_open_text and In_channel.input_all to read the full contents into memory.
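One detail worth knowing with this approach: if the file ends with a newline, String.split_on_char '\n' yields an extra empty string as the last element. The sketch below shows one possible way to drop it; the helper name `read_lines` and the temporary sample file are assumptions for illustration, not part of the original code.

```ocaml
(* Hypothetical helper: read a file fully and split it into lines,
   dropping the empty string produced by a trailing newline. *)
let read_lines path : string list =
  let contents = In_channel.with_open_text path In_channel.input_all in
  match List.rev (String.split_on_char '\n' contents) with
  | "" :: rest -> List.rev rest
  | all -> List.rev all

let () =
  (* Create a throwaway sample file so the example is self-contained *)
  let path = Filename.temp_file "sample" ".txt" in
  Out_channel.with_open_text path (fun oc ->
      Out_channel.output_string oc "helloearth\nhellomars\nhellojupiter\n");
  List.iter print_endline (read_lines path);
  Sys.remove path
```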
Reading line by line
let () =
  let filename = "input.txt" in
  In_channel.with_open_text filename (fun in_channel ->
    let rec read_lines () =
      match In_channel.input_line in_channel with
      | Some line ->
        print_endline line;
        read_lines () (* Continue reading next lines *)
      | None -> print_endline "\n~~End of file"
    in
    read_lines ())
Here we are using In_channel.with_open_text and a lambda function to make use of In_channel.input_line. We need to recurse until we reach the end of the file.
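If the lines are needed as values rather than printed, the same recursive pattern can collect them into a list. This is only a sketch: the helper name `lines_of_channel` is hypothetical, and the accumulator is reversed at the end because lines are prepended as they are read.

```ocaml
(* Hypothetical helper: collect every line of a channel into a list. *)
let lines_of_channel (ic : In_channel.t) : string list =
  let rec loop acc =
    match In_channel.input_line ic with
    | Some line -> loop (line :: acc) (* prepend, then keep reading *)
    | None -> List.rev acc            (* end of file: restore file order *)
  in
  loop []

let () =
  (* Create a throwaway sample file so the example is self-contained *)
  let path = Filename.temp_file "sample" ".txt" in
  Out_channel.with_open_text path (fun oc ->
      Out_channel.output_string oc "helloearth\nhellomars\nhellojupiter\n");
  let lines = In_channel.with_open_text path lines_of_channel in
  List.iter print_endline lines;
  Sys.remove path
```

The loop is tail-recursive, so it should not grow the stack on long files.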
Let’s break down why we need to pass (fun in_channel -> ...) as the second argument to In_channel.with_open_text here, whereas In_channel.input_all could be passed directly, and what is happening with the function arguments.
The function In_channel.with_open_text expects two arguments: a string (the file path) and a function. This function must take an In_channel.t (the file channel) as its argument and return a result (often a string or some processed data from the file).
In_channel.input_all is a function that takes an In_channel.t and returns a string.
val input_all : In_channel.t -> string
When we pass In_channel.input_all to In_channel.with_open_text, we are passing the function itself as a value, without applying it. In_channel.with_open_text then applies it to the channel once the file is open.
As discussed above, In_channel.with_open_text expects a function as its second argument, one that accepts an In_channel.t and returns a result. In_channel.input_all fits perfectly here, so we can pass it directly.
let file_contents = In_channel.with_open_text "filename.txt" In_channel.input_all
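To make the argument passing concrete, the following sketch compares passing In_channel.input_all directly with wrapping it in an explicit anonymous function; both forms are equivalent. It also shows a callback that returns an int, since with_open_text has type string -> (In_channel.t -> 'a) -> 'a and returns whatever the callback returns. The helper names `read_direct`, `read_wrapped` and `file_length` are hypothetical.

```ocaml
(* Hypothetical helpers, for illustration only. *)

(* Passing input_all directly as the callback *)
let read_direct path : string =
  In_channel.with_open_text path In_channel.input_all

(* The same thing with an explicit anonymous function *)
let read_wrapped path : string =
  In_channel.with_open_text path (fun ic -> In_channel.input_all ic)

(* The callback can return any type; here it returns an int *)
let file_length path : int =
  In_channel.with_open_text path (fun ic ->
      String.length (In_channel.input_all ic))

let () =
  (* Create a throwaway sample file so the example is self-contained *)
  let path = Filename.temp_file "sample" ".txt" in
  Out_channel.with_open_text path (fun oc ->
      Out_channel.output_string oc "helloearth\n");
  assert (read_direct path = read_wrapped path);
  Printf.printf "%d\n" (file_length path);
  Sys.remove path
```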
On the other hand, In_channel.input_line is designed to read a single line from an input channel, and its signature is:
val input_line : In_channel.t -> string option
It returns an option because there are two possible results:
- A line was available to be read (Some line)
- End of file has been reached (None)
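The two cases can be observed directly by calling In_channel.input_line twice on a one-line file. In this sketch, the helper `describe` and the temporary sample file are assumptions made for illustration.

```ocaml
(* Hypothetical helper: turn the result of input_line into a message. *)
let describe (r : string option) : string =
  match r with
  | Some line -> "line: " ^ line
  | None -> "end of file"

let () =
  (* Create a throwaway one-line sample file *)
  let path = Filename.temp_file "sample" ".txt" in
  Out_channel.with_open_text path (fun oc ->
      Out_channel.output_string oc "helloearth\n");
  In_channel.with_open_text path (fun ic ->
      (* First call: a line is available, so we get Some "helloearth" *)
      print_endline (describe (In_channel.input_line ic));
      (* Second call: nothing is left, so we get None *)
      print_endline (describe (In_channel.input_line ic)));
  Sys.remove path
```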
Given this, we need to loop until we have read all the contents of the file, since a single input_line call reads only one line. Therefore, passing In_channel.input_line directly will not read the whole file. Now, what if we do it anyway?
NOTE: We could still pass the function and match on the result, but we would still be reading only one line, which is not our goal. This case is covered below.
We will make the changes one at a time and observe the results.
- Replacing input_all with input_line
let () =
  let filename = "input.txt" in
  (* Changed input_all to input_line *)
  let contents = In_channel.with_open_text filename In_channel.input_line in
  let lst = String.split_on_char '\n' contents in
  List.iter (fun x -> print_endline x) lst
Result
File "./program.ml", line 4, characters 44-52:
4 | let lst = String.split_on_char '\n' contents in
                                        ^^^^^^^^
Error: This expression has type string option
       but an expression was expected of type string
That was expected: In_channel.input_line returns a string option, but String.split_on_char expects a string.
Next, we remove the extra processing and simply print the result. We now expect only one line, the first line of the file.
let () =
  let filename = "input.txt" in
  let contents = In_channel.with_open_text filename In_channel.input_line in
  print_endline contents
Result
File "./program.ml", line 4, characters 22-30:
4 | print_endline contents
                  ^^^^^^^^
Error: This expression has type string option
       but an expression was expected of type string
Yes, this is expected. As discussed, In_channel.input_line returns an option, and we need to handle both of its cases.
Changing further with match
let () =
  let filename = "input.txt" in
  let contents = In_channel.with_open_text filename In_channel.input_line in
  match contents with
  | Some line -> print_endline line
  | None -> print_endline "End of file reached"
Result
❯ ocaml full_file_2.ml
hello
This makes sense: there is no loop, so we only print the first line of the file.
Finally, with enough changes to read the full file:
let () =
  let filename = "input.txt" in
  In_channel.with_open_text filename (fun in_channel ->
    let rec loop () =
      match In_channel.input_line in_channel with
      | Some line ->
        print_endline line;
        loop ()
      | None -> print_endline "~~ End of file reached"
    in
    loop ())
Result
❯ ocaml program.ml
helloearth
hellomars
hellojupiter
~~ End of file reached