Details regarding cyclic data structures in Rust and Haskell

This page describes the details with code examples to explain how cyclic data structures can be implemented in Haskell and how in Rust.

In Haskell it is easy like this:

data Syntax = 
      Terminal Char
    | Sequence [ Syntax ]
    | Choice [ Syntax ]
    | Repetition Syntax
    | None
    deriving Show

In Rust it looks like this:

pub enum Syntax {
    Terminal( char ), 
    Sequence( Vec< Syntax > ), 
    Choice( Vec< Syntax >), 
    Repetition( Syntax ), 
    None, 
}

Unfortunately the Rust compiler complains:

error[E0072]: recursive type `Syntax` has infinite size
 --> src/main.rs:3:1
  |
3 | pub enum Syntax {
  | ^^^^^^^^^^^^^^^
...
7 |     Repetition( Syntax ), 
  |                 ------ recursive without indirection
  |
help: insert some indirection (e.g., a `Box`, `Rc`, or `&`) to break the cycle
  |
7 |     Repetition( Box<Syntax> ), 
  |                 ++++      +

As the help suggests, it can be fixed like this:

pub enum Syntax {
    Terminal( char ), 
    Sequence( Vec< Syntax > ), 
    Choice( Vec< Syntax >), 
    Repetition( Box< Syntax > ), 
    None, 
}

The problem begins as soon as you are going to instantiate a cyclic structure making cycles.

Cyclic syntax graph

In Haskell that easy:

synAddOperator = Choice [ Terminal '+', Terminal '-' ]
synAddOperation = Sequence [ synAddOperator, synMulExpression ]
synAddOperations = Repetition synAddOperation
synMulOperator = Choice [ Terminal '*',  Terminal '/' ]
synMulOperation = Sequence [ synMulOperator, synMulExpression ]
synMulOperations = Choice [ synMulOperation, None ]
synNumber = Choice [ Terminal '0', Terminal '1', Terminal '2', Terminal '3' ]
synMulExpression = Sequence [ synNumber, synMulOperations ]
synAddExpression = Sequence [ synMulExpression, synAddOperations ]

In Rust it looks like this:

let syn_add_operator = Syntax::Choice( vec![ Syntax::Terminal( '+' ), Syntax::Terminal( '-' ) ]);
let syn_add_operation = Syntax::Sequence( vec![ syn_add_operator, syn_mul_expression ] );
let syn_add_operations = Syntax::Repetition( Box::new( syn_add_operation ) );
let syn_mul_operator = Syntax::Choice( vec! [ Syntax::Terminal( '*' ),  Syntax::Terminal( '/' ) ] );
let syn_mul_operation = Syntax::Sequence( vec! [ syn_mul_operator, syn_mul_expression ] );
let syn_mul_operations = Syntax::Choice( vec! [ syn_mul_operation, Syntax::None ] );
let syn_number = Syntax::Choice( vec! [ Syntax::Terminal( '0' ), Syntax::Terminal( '1' ), Syntax::Terminal( '2' ), Syntax::Terminal( '3' ) ] );
let syn_mul_expression = Syntax::Sequence( vec! [ syn_number, syn_mul_operations ] );
let syn_add_expression = Syntax::Sequence( vec! [ syn_mul_expression, syn_add_operations ] );

Unfortunately the Rust compiler complains again:

error[E0425]: cannot find value `syn_mul_expression` in this scope
  --> src/main.rs:16:71
   |
16 |     let syn_add_operation = Syntax::Sequence( vec![ syn_add_operator, syn_mul_expression ] );
   |                                                                       ^^^^^^^^^^^^^^^^^^ not found in this scope

error[E0425]: cannot find value `syn_mul_expression` in this scope
  --> src/main.rs:25:72
   |
25 |     let syn_mul_operation = Syntax::Sequence( vec! [ syn_mul_operator, syn_mul_expression ] );
   |                                                                        ^^^^^^^^^^^^^^^^^^ not found in this scope

…because syn_mul_expression is not declare before syn_add_operation and syn_mul_operation, respectively.

It is impossible to solve this problem by changing the order of instantiations.

To solve this you have to use traits like Rc and RefCell, may be HashMaps and macros to come with an elegant declarative way. Additionally, the counting reference instances (trait Rc) of the cycles have to be detected and forced to free the memory.