Showing posts with label programming. Show all posts
Showing posts with label programming. Show all posts

2014-07-21

C++11

In these weeks I am struggling with some large C++ codebase I need to extend and port to a new architecture. Its original developers did not care of commenting the code, nor did they produce any documentation explaining the inner details of this beast. Variable and function names are sometimes unrelated to the true purpose of the object they name. So my task is quite painful, and the characteristics of the C++ language are not helping me much. In this post I'll describe several problems I am finding about C++. It is a follow-up of my previous post, What Ada and D taught me about C++.

C++11 + Boost? Nice! (In theory)

I got quite interested when I saw the new features in the C++11 standard: the auto keyword, enum class, and vector initializers. Since C++11 is compatible with the old C++98 used in the code I am porting, I decided to use it whenever I had to modify/add something. In addition to the facilities provided by bare C++11, I knew I would have needed the Boost libraries, as I had to add support for property trees and for formatted messages, as well as automated tests.

"Standard" libraries? Oh my!

I started developing a few new modules using my laptop (Linux Mint 17, with GCC 4.8.2). Things went quite smooth: C++11 is much nicer than the old C++98 I was used to, apart from the slow compilation times (once you used Free Pascal once, you're always comparing compilation speeds to it!). But once I tried to deploy the code on the production platform, things immediately got much worse!

The platform for which I am porting this code is a cluster with a quite old version of GCC (4.4.6). I already knew that C++11 was a no-option with GCC 4.4, so I already compiled and installed GCC 4.8.0 under my homedir a few months ago. So I set up the environment variables in order to use my own GCC, and fired make.

BANG! A series of error messages involving Boost::Spirit (which I never used, but it's included by Boost::Property_tree) popped on the screen. I think I set a new world record: how can an error message be thousands of characters long?!?

/usr/include/boost/property_tree/detail/json_parser_read.hpp: In
instantiation of ‘void boost::property_tree::json_parser::contex
t<Ptree>::a_literal_val::operator()(boost::property_tree::json_p
arser::context<Ptree>::It, boost::property_tree::json_parser::co
ntext<Ptree>::It) const [with Ptree = boost::property_tree::basi
c_ptree<std::basic_string<char>, std::basic_string<char> >; boos
t::property_tree::json_parser::context<Ptree>::It = __gnu_cxx::_
_normal_iterator<char*, std::vector<char, std::allocator<char> > >]’:
/usr/include/boost/spirit/home/classic/core/scanner/scanner.hpp:
148:30:   required from ‘static void boost::spirit::classic::att
ributed_action_policy<boost::spirit::classic::nil_t>::call(const
ActorT&, boost::spirit::classic::nil_t, const IteratorT&, const
IteratorT&) [with ActorT = boost::property_tree::json_parser::co
ntext<boost::property_tree::basic_ptree<std::basic_string<char>,
std::basic_string<char> > >::a_literal_val; IteratorT = __gnu_cx
x::__normal_iterator<char*, std::vector<char, std::allocator<cha
r> > >]’
/usr/include/boost/spirit/home/classic/core/scanner/scanner.hpp:
163:74:   required from ‘void boost::spirit::classic::action_pol
icy::do_action(const ActorT&, AttrT&, const IteratorT&, const It
eratorT&) const [with ActorT = boost::property_tree::json_parser
::context<boost::property_tree::basic_ptree<std::basic_string<ch
ar>, std::basic_string<char> > >::a_literal_val; AttrT = boost::
spirit::classic::nil_t; IteratorT = __gnu_cxx::__normal_iterator
<char*, std::vector<char, std::allocator<char> > >]’
/usr/include/boost/spirit/home/classic/core/composite/actions.hp
p:116:17:   required from ‘typename boost::spirit::classic::pars
er_result<boost::spirit::classic::action<ParserT, ActionT>, Scan
nerT>::type boost::spirit::classic::action<ParserT, ActionT>::pa
rse(const ScannerT&) const [with ScannerT = boost::spirit::class
ic::scanner<__gnu_cxx::__normal_iterator<char*, std::vector<char
, std::allocator<char> > >, boost::spirit::classic::scanner_poli
cies<boost::spirit::classic::skip_parser_iteration_policy<boost:
:spirit::classic::alternative<boost::spirit::classic::alternativ
e<boost::spirit::classic::space_parser, boost::spirit::classic::
confix_parser<boost::spirit::classic::strlit<const char*>, boost
::spirit::classic::kleene_star<boost::spirit::classic::anychar_p
arser>, boost::spirit::classic::alternative<boost::spirit::class
ic::eol_parser, boost::spirit::classic::end_parser>, boost::spir
it::classic::unary_parser_category, boost::spirit::classic::non_
nested, boost::spirit::classic::is_lexeme> >, boost::spirit::cla
ssic::confix_parser<boost::spirit::classic::strlit<const char*>,
boost::spirit::classic::kleene_star<boost::spirit::classic::anyc
har_parser>, boost::spirit::classic::strlit<const char*>, boost:
:spirit::classic::unary_parser_category, boost::spirit::classic:
:non_nested, boost::spirit::classic::is_lexeme> >, boost::spirit
::classic::iteration_policy>, boost::spirit::classic::match_poli
cy, boost::spirit::classic::action_policy> >; ParserT = boost::s
pirit::classic::alternative<boost::spirit::classic::alternative<
boost::spirit::classic::alternative<boost::spirit::classic::rule
<boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<ch
ar*, std::vector<char, std::allocator<char> > >, boost::spirit::
classic::scanner_policies<boost::spirit::classic::skip_parser_it
eration_policy<boost::spirit::classic::alternative<boost::spirit
::classic::alternative<boost::spirit::classic::space_parser, boo
st::spirit::classic::confix_parser<boost::spirit::classic::strli
t<const char*>, boost::spirit::classic::kleene_star<boost::spiri
t::classic::anychar_parser>, boost::spirit::classic::alternative
<boost::spirit::classic::eol_parser, boost::spirit::classic::end
_parser>, boost::spirit::classic::unary_parser_category, boost::
spirit::classic::non_nested, boost::spirit::classic::is_lexeme>
>, boost::spirit::classic::confix_parser<boost::spirit::classic:
:strlit<const char*>, boost::spirit::classic::kleene_star<boost:
:spirit::classic::anychar_parser>, boost::spirit::classic::strli
t<const char*>, boost::spirit::classic::unary_parser_category, b
oost::spirit::classic::non_nested, boost::spirit::classic::is_le
xeme> >, boost::spirit::classic::iteration_policy>, boost::spiri
t::classic::match_policy, boost::spirit::classic::action_policy>
>, boost::spirit::classic::nil_t, boost::spirit::classic::nil_t>
, boost::spirit::classic::strlit<const char*> >, boost::spirit::
classic::strlit<const char*> >, boost::spirit::classic::strlit<c
onst char*> >; ActionT = boost::property_tree::json_parser::cont
ext<boost::property_tree::basic_ptree<std::basic_string<char>, s
td::basic_string<char> > >::a_literal_val; typename boost::spiri
t::classic::parser_result<boost::spirit::classic::action<ParserT
, ActionT>, ScannerT>::type = boost::spirit::classic::match<boos
t::spirit::classic::nil_t>]’
/usr/include/boost/spirit/home/classic/core/composite/alternativ
e.hpp:67:59:   recursively required from ‘typename boost::spirit
::classic::parser_result<boost::spirit::classic::alternative<A,
B>, ScannerT>::type boost::spirit::classic::alternative<A, B>::p
arse(const ScannerT&) const [with ScannerT = boost::spirit::clas
sic::scanner<__gnu_cxx::__normal_iterator<char*, std::vector<cha
r, std::allocator<char> > >, boost::spirit::classic::scanner_pol
icies<boost::spirit::classic::skip_parser_iteration_policy<boost
::spirit::classic::alternative<boost::spirit::classic::alternati
ve<boost::spirit::classic::space_parser, boost::spirit::classic:
:confix_parser<boost::spirit::classic::strlit<const char*>, boos
t::spirit::classic::kleene_star<boost::spirit::classic::anychar_
parser>, boost::spirit::classic::alternative<boost::spirit::clas
sic::eol_parser, boost::spirit::classic::end_parser>, boost::spi
rit::classic::unary_parser_category, boost::spirit::classic::non
_nested, boost::spirit::classic::is_lexeme> >, boost::spirit::cl
assic::confix_parser<boost::spirit::classic::strlit<const char*>
, boost::spirit::classic::kleene_star<boost::spirit::classic::an
ychar_parser>, boost::spirit::classic::strlit<const char*>, boos
t::spirit::classic::unary_parser_category, boost::spirit::classi
c::non_nested, boost::spirit::classic::is_lexeme> >, boost::spir
it::classic::iteration_policy>, boost::spirit::classic::match_po
licy, boost::spirit::classic::action_policy> >; A = boost::spiri
t::classic::alternative<boost::spirit::classic::action<boost::sp
irit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::_
_normal_iterator<char*, std::vector<char, std::allocator<char> >
>, boost::spirit::classic::scanner_policies<boost::spirit::class
ic::skip_parser_iteration_policy<boost::spirit::classic::alterna
tive<boost::spirit::classic::alternative<boost::spirit::classic:
:space_parser, boost::spirit::classic::confix_parser<boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::kleene
_star<boost::spirit::classic::anychar_parser>, boost::spirit::cl
assic::alternative<boost::spirit::classic::eol_parser, boost::sp
irit::classic::end_parser>, boost::spirit::classic::unary_parser
_category, boost::spirit::classic::non_nested, boost::spirit::cl
assic::is_lexeme> >, boost::spirit::classic::confix_parser<boost
::spirit::classic::strlit<const char*>, boost::spirit::classic::
kleene_star<boost::spirit::classic::anychar_parser>, boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::unary_
parser_category, boost::spirit::classic::non_nested, boost::spir
it::classic::is_lexeme> >, boost::spirit::classic::iteration_pol
icy>, boost::spirit::classic::match_policy, boost::spirit::class
ic::action_policy> >, boost::spirit::classic::nil_t, boost::spir
it::classic::nil_t>, boost::property_tree::json_parser::context<
boost::property_tree::basic_ptree<std::basic_string<char>, std::
basic_string<char> > >::a_string_val>, boost::spirit::classic::a
ction<boost::spirit::classic::alternative<boost::spirit::classic
::alternative<boost::spirit::classic::alternative<boost::spirit:
:classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__norm
al_iterator<char*, std::vector<char, std::allocator<char> > >, b
oost::spirit::classic::scanner_policies<boost::spirit::classic::
skip_parser_iteration_policy<boost::spirit::classic::alternative
<boost::spirit::classic::alternative<boost::spirit::classic::spa
ce_parser, boost::spirit::classic::confix_parser<boost::spirit::
classic::strlit<const char*>, boost::spirit::classic::kleene_sta
r<boost::spirit::classic::anychar_parser>, boost::spirit::classi
c::alternative<boost::spirit::classic::eol_parser, boost::spirit
::classic::end_parser>, boost::spirit::classic::unary_parser_cat
egory, boost::spirit::classic::non_nested, boost::spirit::classi
c::is_lexeme> >, boost::spirit::classic::confix_parser<boost::sp
irit::classic::strlit<const char*>, boost::spirit::classic::klee
ne_star<boost::spirit::classic::anychar_parser>, boost::spirit::
classic::strlit<const char*>, boost::spirit::classic::unary_pars
er_category, boost::spirit::classic::non_nested, boost::spirit::
classic::is_lexeme> >, boost::spirit::classic::iteration_policy>
, boost::spirit::classic::match_policy, boost::spirit::classic::
action_policy> >, boost::spirit::classic::nil_t, boost::spirit::
classic::nil_t>, boost::spirit::classic::strlit<const char*> >,
boost::spirit::classic::strlit<const char*> >, boost::spirit::cl
assic::strlit<const char*> >, boost::property_tree::json_parser:
:context<boost::property_tree::basic_ptree<std::basic_string<cha
r>, std::basic_string<char> > >::a_literal_val> >; B = boost::sp
irit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::_
_normal_iterator<char*, std::vector<char, std::allocator<char> >
>, boost::spirit::classic::scanner_policies<boost::spirit::class
ic::skip_parser_iteration_policy<boost::spirit::classic::alterna
tive<boost::spirit::classic::alternative<boost::spirit::classic:
:space_parser, boost::spirit::classic::confix_parser<boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::kleene
_star<boost::spirit::classic::anychar_parser>, boost::spirit::cl
assic::alternative<boost::spirit::classic::eol_parser, boost::sp
irit::classic::end_parser>, boost::spirit::classic::unary_parser
_category, boost::spirit::classic::non_nested, boost::spirit::cl
assic::is_lexeme> >, boost::spirit::classic::confix_parser<boost
::spirit::classic::strlit<const char*>, boost::spirit::classic::
kleene_star<boost::spirit::classic::anychar_parser>, boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::unary_
parser_category, boost::spirit::classic::non_nested, boost::spir
it::classic::is_lexeme> >, boost::spirit::classic::iteration_pol
icy>, boost::spirit::classic::match_policy, boost::spirit::class
ic::action_policy> >, boost::spirit::classic::nil_t, boost::spir
it::classic::nil_t>; typename boost::spirit::classic::parser_res
ult<boost::spirit::classic::alternative<A, B>, ScannerT>::type =
boost::spirit::classic::match<boost::spirit::classic::nil_t>]’
/usr/include/boost/spirit/home/classic/core/composite/alternativ
e.hpp:67:59:   required from ‘typename boost::spirit::classic::p
arser_result<boost::spirit::classic::alternative<A, B>, ScannerT
>::type boost::spirit::classic::alternative<A, B>::parse(const S
cannerT&) const [with ScannerT = boost::spirit::classic::scanner
<__gnu_cxx::__normal_iterator<char*, std::vector<char, std::allo
cator<char> > >, boost::spirit::classic::scanner_policies<boost:
:spirit::classic::skip_parser_iteration_policy<boost::spirit::cl
assic::alternative<boost::spirit::classic::alternative<boost::sp
irit::classic::space_parser, boost::spirit::classic::confix_pars
er<boost::spirit::classic::strlit<const char*>, boost::spirit::c
lassic::kleene_star<boost::spirit::classic::anychar_parser>, boo
st::spirit::classic::alternative<boost::spirit::classic::eol_par
ser, boost::spirit::classic::end_parser>, boost::spirit::classic
::unary_parser_category, boost::spirit::classic::non_nested, boo
st::spirit::classic::is_lexeme> >, boost::spirit::classic::confi
x_parser<boost::spirit::classic::strlit<const char*>, boost::spi
rit::classic::kleene_star<boost::spirit::classic::anychar_parser
>, boost::spirit::classic::strlit<const char*>, boost::spirit::c
lassic::unary_parser_category, boost::spirit::classic::non_neste
d, boost::spirit::classic::is_lexeme> >, boost::spirit::classic:
:iteration_policy>, boost::spirit::classic::match_policy, boost:
:spirit::classic::action_policy> >; A = boost::spirit::classic::
alternative<boost::spirit::classic::alternative<boost::spirit::c
lassic::action<boost::spirit::classic::rule<boost::spirit::class
ic::scanner<__gnu_cxx::__normal_iterator<char*, std::vector<char
, std::allocator<char> > >, boost::spirit::classic::scanner_poli
cies<boost::spirit::classic::skip_parser_iteration_policy<boost:
:spirit::classic::alternative<boost::spirit::classic::alternativ
e<boost::spirit::classic::space_parser, boost::spirit::classic::
confix_parser<boost::spirit::classic::strlit<const char*>, boost
::spirit::classic::kleene_star<boost::spirit::classic::anychar_p
arser>, boost::spirit::classic::alternative<boost::spirit::class
ic::eol_parser, boost::spirit::classic::end_parser>, boost::spir
it::classic::unary_parser_category, boost::spirit::classic::non_
nested, boost::spirit::classic::is_lexeme> >, boost::spirit::cla
ssic::confix_parser<boost::spirit::classic::strlit<const char*>,
boost::spirit::classic::kleene_star<boost::spirit::classic::anyc
har_parser>, boost::spirit::classic::strlit<const char*>, boost:
:spirit::classic::unary_parser_category, boost::spirit::classic:
:non_nested, boost::spirit::classic::is_lexeme> >, boost::spirit
::classic::iteration_policy>, boost::spirit::classic::match_poli
cy, boost::spirit::classic::action_policy> >, boost::spirit::cla
ssic::nil_t, boost::spirit::classic::nil_t>, boost::property_tre
e::json_parser::context<boost::property_tree::basic_ptree<std::b
asic_string<char>, std::basic_string<char> > >::a_string_val>, b
oost::spirit::classic::action<boost::spirit::classic::alternativ
e<boost::spirit::classic::alternative<boost::spirit::classic::al
ternative<boost::spirit::classic::rule<boost::spirit::classic::s
canner<__gnu_cxx::__normal_iterator<char*, std::vector<char, std
::allocator<char> > >, boost::spirit::classic::scanner_policies<
boost::spirit::classic::skip_parser_iteration_policy<boost::spir
it::classic::alternative<boost::spirit::classic::alternative<boo
st::spirit::classic::space_parser, boost::spirit::classic::confi
x_parser<boost::spirit::classic::strlit<const char*>, boost::spi
rit::classic::kleene_star<boost::spirit::classic::anychar_parser
>, boost::spirit::classic::alternative<boost::spirit::classic::e
ol_parser, boost::spirit::classic::end_parser>, boost::spirit::c
lassic::unary_parser_category, boost::spirit::classic::non_neste
d, boost::spirit::classic::is_lexeme> >, boost::spirit::classic:
:confix_parser<boost::spirit::classic::strlit<const char*>, boos
t::spirit::classic::kleene_star<boost::spirit::classic::anychar_
parser>, boost::spirit::classic::strlit<const char*>, boost::spi
rit::classic::unary_parser_category, boost::spirit::classic::non
_nested, boost::spirit::classic::is_lexeme> >, boost::spirit::cl
assic::iteration_policy>, boost::spirit::classic::match_policy,
boost::spirit::classic::action_policy> >, boost::spirit::classic
::nil_t, boost::spirit::classic::nil_t>, boost::spirit::classic:
:strlit<const char*> >, boost::spirit::classic::strlit<const cha
r*> >, boost::spirit::classic::strlit<const char*> >, boost::pro
perty_tree::json_parser::context<boost::property_tree::basic_ptr
ee<std::basic_string<char>, std::basic_string<char> > >::a_liter
al_val> >, boost::spirit::classic::rule<boost::spirit::classic::
scanner<__gnu_cxx::__normal_iterator<char*, std::vector<char, st
d::allocator<char> > >, boost::spirit::classic::scanner_policies
<boost::spirit::classic::skip_parser_iteration_policy<boost::spi
rit::classic::alternative<boost::spirit::classic::alternative<bo
ost::spirit::classic::space_parser, boost::spirit::classic::conf
ix_parser<boost::spirit::classic::strlit<const char*>, boost::sp
irit::classic::kleene_star<boost::spirit::classic::anychar_parse
r>, boost::spirit::classic::alternative<boost::spirit::classic::
eol_parser, boost::spirit::classic::end_parser>, boost::spirit::
classic::unary_parser_category, boost::spirit::classic::non_nest
ed, boost::spirit::classic::is_lexeme> >, boost::spirit::classic
::confix_parser<boost::spirit::classic::strlit<const char*>, boo
st::spirit::classic::kleene_star<boost::spirit::classic::anychar
_parser>, boost::spirit::classic::strlit<const char*>, boost::sp
irit::classic::unary_parser_category, boost::spirit::classic::no
n_nested, boost::spirit::classic::is_lexeme> >, boost::spirit::c
lassic::iteration_policy>, boost::spirit::classic::match_policy,
boost::spirit::classic::action_policy> >, boost::spirit::classic
::nil_t, boost::spirit::classic::nil_t> >; B = boost::spirit::cl
assic::rule<boost::spirit::classic::scanner<__gnu_cxx::__normal_
iterator<char*, std::vector<char, std::allocator<char> > >, boos
t::spirit::classic::scanner_policies<boost::spirit::classic::ski
p_parser_iteration_policy<boost::spirit::classic::alternative<bo
ost::spirit::classic::alternative<boost::spirit::classic::space_
parser, boost::spirit::classic::confix_parser<boost::spirit::cla
ssic::strlit<const char*>, boost::spirit::classic::kleene_star<b
oost::spirit::classic::anychar_parser>, boost::spirit::classic::
alternative<boost::spirit::classic::eol_parser, boost::spirit::c
lassic::end_parser>, boost::spirit::classic::unary_parser_catego
ry, boost::spirit::classic::non_nested, boost::spirit::classic::
is_lexeme> >, boost::spirit::classic::confix_parser<boost::spiri
t::classic::strlit<const char*>, boost::spirit::classic::kleene_
star<boost::spirit::classic::anychar_parser>, boost::spirit::cla
ssic::strlit<const char*>, boost::spirit::classic::unary_parser_
category, boost::spirit::classic::non_nested, boost::spirit::cla
ssic::is_lexeme> >, boost::spirit::classic::iteration_policy>, b
oost::spirit::classic::match_policy, boost::spirit::classic::act
ion_policy> >, boost::spirit::classic::nil_t, boost::spirit::cla
ssic::nil_t>; typename boost::spirit::classic::parser_result<boo
st::spirit::classic::alternative<A, B>, ScannerT>::type = boost:
:spirit::classic::match<boost::spirit::classic::nil_t>]’
/usr/include/boost/spirit/home/classic/core/non_terminal/impl/ru
le.ipp:240:36:   required from ‘typename boost::spirit::classic:
:match_result<ScannerT, ContextResultT>::type boost::spirit::cla
ssic::impl::concrete_parser<ParserT, ScannerT, AttrT>::do_parse_
virtual(const ScannerT&) const [with ParserT = boost::spirit::cl
assic::alternative<boost::spirit::classic::alternative<boost::sp
irit::classic::alternative<boost::spirit::classic::action<boost:
:spirit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx
::__normal_iterator<char*, std::vector<char, std::allocator<char
> > >, boost::spirit::classic::scanner_policies<boost::spirit::c
lassic::skip_parser_iteration_policy<boost::spirit::classic::alt
ernative<boost::spirit::classic::alternative<boost::spirit::clas
sic::space_parser, boost::spirit::classic::confix_parser<boost::
spirit::classic::strlit<const char*>, boost::spirit::classic::kl
eene_star<boost::spirit::classic::anychar_parser>, boost::spirit
::classic::alternative<boost::spirit::classic::eol_parser, boost
::spirit::classic::end_parser>, boost::spirit::classic::unary_pa
rser_category, boost::spirit::classic::non_nested, boost::spirit
::classic::is_lexeme> >, boost::spirit::classic::confix_parser<b
oost::spirit::classic::strlit<const char*>, boost::spirit::class
ic::kleene_star<boost::spirit::classic::anychar_parser>, boost::
spirit::classic::strlit<const char*>, boost::spirit::classic::un
ary_parser_category, boost::spirit::classic::non_nested, boost::
spirit::classic::is_lexeme> >, boost::spirit::classic::iteration
_policy>, boost::spirit::classic::match_policy, boost::spirit::c
lassic::action_policy> >, boost::spirit::classic::nil_t, boost::
spirit::classic::nil_t>, boost::property_tree::json_parser::cont
ext<boost::property_tree::basic_ptree<std::basic_string<char>, s
td::basic_string<char> > >::a_string_val>, boost::spirit::classi
c::action<boost::spirit::classic::alternative<boost::spirit::cla
ssic::alternative<boost::spirit::classic::alternative<boost::spi
rit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__
normal_iterator<char*, std::vector<char, std::allocator<char> >
>, boost::spirit::classic::scanner_policies<boost::spirit::class
ic::skip_parser_iteration_policy<boost::spirit::classic::alterna
tive<boost::spirit::classic::alternative<boost::spirit::classic:
:space_parser, boost::spirit::classic::confix_parser<boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::kleene
_star<boost::spirit::classic::anychar_parser>, boost::spirit::cl
assic::alternative<boost::spirit::classic::eol_parser, boost::sp
irit::classic::end_parser>, boost::spirit::classic::unary_parser
_category, boost::spirit::classic::non_nested, boost::spirit::cl
assic::is_lexeme> >, boost::spirit::classic::confix_parser<boost
::spirit::classic::strlit<const char*>, boost::spirit::classic::
kleene_star<boost::spirit::classic::anychar_parser>, boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::unary_
parser_category, boost::spirit::classic::non_nested, boost::spir
it::classic::is_lexeme> >, boost::spirit::classic::iteration_pol
icy>, boost::spirit::classic::match_policy, boost::spirit::class
ic::action_policy> >, boost::spirit::classic::nil_t, boost::spir
it::classic::nil_t>, boost::spirit::classic::strlit<const char*>
>, boost::spirit::classic::strlit<const char*> >, boost::spirit:
:classic::strlit<const char*> >, boost::property_tree::json_pars
er::context<boost::property_tree::basic_ptree<std::basic_string<
char>, std::basic_string<char> > >::a_literal_val> >, boost::spi
rit::classic::rule<boost::spirit::classic::scanner<__gnu_cxx::__
normal_iterator<char*, std::vector<char, std::allocator<char> >
>, boost::spirit::classic::scanner_policies<boost::spirit::class
ic::skip_parser_iteration_policy<boost::spirit::classic::alterna
tive<boost::spirit::classic::alternative<boost::spirit::classic:
:space_parser, boost::spirit::classic::confix_parser<boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::kleene
_star<boost::spirit::classic::anychar_parser>, boost::spirit::cl
assic::alternative<boost::spirit::classic::eol_parser, boost::sp
irit::classic::end_parser>, boost::spirit::classic::unary_parser
_category, boost::spirit::classic::non_nested, boost::spirit::cl
assic::is_lexeme> >, boost::spirit::classic::confix_parser<boost
::spirit::classic::strlit<const char*>, boost::spirit::classic::
kleene_star<boost::spirit::classic::anychar_parser>, boost::spir
it::classic::strlit<const char*>, boost::spirit::classic::unary_
parser_category, boost::spirit::classic::non_nested, boost::spir
it::classic::is_lexeme> >, boost::spirit::classic::iteration_pol
icy>, boost::spirit::classic::match_policy, boost::spirit::class
ic::action_policy> >, boost::spirit::classic::nil_t, boost::spir
it::classic::nil_t> >, boost::spirit::classic::rule<boost::spiri
t::classic::scanner<__gnu_cxx::__normal_iterator<char*, std::vec
tor<char, std::allocator<char> > >, boost::spirit::classic::scan
ner_policies<boost::spirit::classic::skip_parser_iteration_polic
y<boost::spirit::classic::alternative<boost::spirit::classic::al
ternative<boost::spirit::classic::space_parser, boost::spirit::c
lassic::confix_parser<boost::spirit::classic::strlit<const char*
>, boost::spirit::classic::kleene_star<boost::spirit::classic::a
nychar_parser>, boost::spirit::classic::alternative<boost::spiri
t::classic::eol_parser, boost::spirit::classic::end_parser>, boo
st::spirit::classic::unary_parser_category, boost::spirit::class
ic::non_nested, boost::spirit::classic::is_lexeme> >, boost::spi
rit::classic::confix_parser<boost::spirit::classic::strlit<const
char*>, boost::spirit::classic::kleene_star<boost::spirit::class
ic::anychar_parser>, boost::spirit::classic::strlit<const char*>
, boost::spirit::classic::unary_parser_category, boost::spirit::
classic::non_nested, boost::spirit::classic::is_lexeme> >, boost
::spirit::classic::iteration_policy>, boost::spirit::classic::ma
tch_policy, boost::spirit::classic::action_policy> >, boost::spi
rit::classic::nil_t, boost::spirit::classic::nil_t> >; ScannerT
= boost::spirit::classic::scanner<__gnu_cxx::__normal_iterator<c
har*, std::vector<char, std::allocator<char> > >, boost::spirit:
:classic::scanner_policies<boost::spirit::classic::skip_parser_i
teration_policy<boost::spirit::classic::alternative<boost::spiri
t::classic::alternative<boost::spirit::classic::space_parser, bo
ost::spirit::classic::confix_parser<boost::spirit::classic::strl
it<const char*>, boost::spirit::classic::kleene_star<boost::spir
it::classic::anychar_parser>, boost::spirit::classic::alternativ
e<boost::spirit::classic::eol_parser, boost::spirit::classic::en
d_parser>, boost::spirit::classic::unary_parser_category, boost:
:spirit::classic::non_nested, boost::spirit::classic::is_lexeme>
>, boost::spirit::classic::confix_parser<boost::spirit::classic:
:strlit<const char*>, boost::spirit::classic::kleene_star<boost:
:spirit::classic::anychar_parser>, boost::spirit::classic::strli
t<const char*>, boost::spirit::classic::unary_parser_category, b
oost::spirit::classic::non_nested, boost::spirit::classic::is_le
xeme> >, boost::spirit::classic::iteration_policy>, boost::spiri
t::classic::match_policy, boost::spirit::classic::action_policy>
>; AttrT = boost::spirit::classic::nil_t; typename boost::spirit
::classic::match_result<ScannerT, ContextResultT>::type = boost:
:spirit::classic::match<boost::spirit::classic::nil_t>]’

(I artificially put newlines each 64 characters.) This is pure madness.

After some googling, I discovered that this kind of problems is most likely caused by my Boost distribution (1.41) being older than GCC (4.8.0). So I had to download and install Boost's latest version (1.55) in order to make my code compile in the production environment. And this was not easy: despite the fact that AutoConf was able to use the newer Boost library installed under my home, the compile kept trying to #include files from /usr/include. I had to uninstall the system Boost header files in order to make it compile!

I think this story shows that a de facto standard like Boost cannot be kept separated from the C++ compiler. I know how C++ standardisation works, but from the point of view of an user, having such a huge number of useful libraries not included in the C++ standard is utterly unacceptable. Boost provides a number of indispensable facilities:

  1. Unit test library
  2. JSON parsing (sort of)
  3. Formatted strings
  4. File system access routines
  5. Priority queues
  6. Logging library
  7. Multiprecision arithmetic
  8. Library to parse program options provided through the command line
  9. Regular expressions (I know, the C++11 standard has them, but GCC has yet to implement them.)

I cannot resist but underline that the standard libraries provided by Free Pascal, D, and Ada already have many (if not all) of these. Even languages yet to be released like Nimrod and Rust provide much more than C++11!

Headers and dependencies

As I mentioned in one of my previous posts, one of C++'s most serious problems is the way header files are handled. As a short recap, consider this tiny library to add two numbers:

// File: sum.hpp
int sum(int a, int b);

// File: sum.cpp
int sum(int a, int b) {
    return a + b;
}

This library is used by the following program:

// File: main.cpp
#include "sum.hpp"
#include <iostream>

int main() {
    int a = 1, b = 2;
    std::cout << "The sum of " << a << " and " << b << " is "
              << sum(a, b);
    return 0;
}

A very simple Makefile is used to compile the program and the library:

main: main.o sum.o

(no need to call g++ explicitly: GNU Make is smart enough to do that.) Now, suppose somebody changes file sum.hpp in order to make calculations using floating-point variables, but they forget to update main.cpp:

// File: sum.hpp
double sum(double a, double b);

// File: sum.cpp
double sum(double a, double b) {
    return a + b;
}

When I run make, it will detect that sum.cpp has changed and will therefore recompile it. However, make has no way to know that main.cpp needs to be recompiled too (because it depends on sum.hpp, which has changed). Therefore, main.o will be left untouched, still believing that sum's arguments are two four-byte integers instead of two floating-point variables. Random errors are very likely to occur!

Several methods have been devised to make GNU Make aware of the dependency of source files on headers. In the case of my codebase, I chose to use GNU Automake (this was far from trivial too, but I will not discuss this topic here). Such a burden should be removed from the programmer's back and be managed by the compiler itself.

Compare this situation with Ada's packages, D modules, and Free Pascal units. In these cases, the compile is able to understand exactly which module needs to be recompiled, without even the need of a Makefile. (Nimrod has the same ability.)

Compilation times

I already mentioned above that compilation times are still horrible with C++11. This has a number of causes. In order of importance, they are:

  1. Boost heavily relies on templates, which in principle should be compiled every time you #include them in your code. (Modern C++ compilers can use precompiled headers; however, their use is tricky.)
  2. Apart from Boost, every header file needs to be compiled every time you include it. (But the header files used in my app are not as long as Boost's.)
  3. C++ is a complex language, and therefore it takes more time to parse it than others.

The primary cause for this is the baroque usage of the C/C++ preprocessor to #include header files. A new module system able to replace the preprocessor would surely help here! Unfortunately, despite some interesting attempts, this was not included in the C++11 standard. (This is apparently not due to lack of motivation, but because of the difficulty in devising a solution which does not break compatibility, when dealing with preprocessor macros. So, the preprocessor is to be blamed here too.) The approach followed by Clang developers seems promising (btw, their page nicely sums up the current state of things with the C++11 preprocessor), and the C++ standardisation committee is trying to have such module system ready in time for the C++17 (they already know it will not ready for C++14.)

Clang's proposal is described in this presentation by Doug Gregor. That's very nice, and it is exactly what C++ needs. But I cannot avoid thinking that similar solutions were already present in the first specification of the Ada language (1983, the same year C++ appeared). And Borland introduced units in Turbo Pascal 4 (1987). Is it possible that the latest C++ standard, released in 2011, still lacks a decent solution for a problem for which much sounder solutions were already available 30 years ago?!?

In the last weeks, I often have found myself thinking of porting this gargantuan C++ codebase to Pascal or Ada. Surely it would require a huge work to rewrite everything from scratch, but it would nevertheless be a very interesting exercise!

2012-11-09

What Ada and D taught me about C++

In the last weeks I needed to write some C++ code and immediately had strong dislike for the language. This has surprised me a bit, since I have used C++ for almost 20 years. (My first C++ compiler was Borland Turbo C++ 3.0 for Windows.)

This reaction of mine has been probably triggered by two facts: (1) reading Coders at work revealed that C++ is strongly disliked by many language designers, and (2) in the past two years I have learned many new languages whose features have often made myself thinking, "Oh, this approach is way better than C++'s!"

So I began thinking of any viable alternative to C++ for my "big" projects, i.e., those projects that are going to be more than a few hundreds SLOCs and therefore would take advantage by a statically-typed language, or projects that need to be really fast and therefore need a compiled language.

Two interesting languages that satisfy both requirements are Ada and D. In the next paragraphs I'll show what I learned about C++'s limitations from my study on both Ada and D.

C++ and its compatibility with C

I've realized that perhaps the biggest problem with C++ is Stroustrup's decision of preserving compatibility with C as much as possible. Although this decision probably offers a partial explanation of the success of C++, it has lead programmers to accept without question limitations and strange features that other modern languages do not have.

The C language was invented "between 1969 and 1973", as Wikipedia says. This means that it is of the same age as Wirth's Pascal, and some of its characteristics have already begun to show their age. Let's see a few examples.

Separation between the interface and the implementation of classes

Consider the declaration of this C++ class (file foo.hpp):

// File foo.hpp
#ifndef FOO_HPP_INCLUDED
#define FOO_HPP_INCLUDED

class Foo {
public:
    int value;
    
    Foo(int aValue) : value(aValue) {}
    void doSomethingFancy();
};

#endif

Any code which needs to use the Foo class should include this header fileusing #include "foo.hpp". Unless the implementation of Foo::doSomethingFancy is not only a few lines long, it should go in a separate .cpp file (foo.cpp):

// File foo.cpp
#include "foo.hpp"

void Foo::doSomethingFancy()
{
    // Do something fancy!
}

What is the problem with this approach? Let's reimplement class Foo in Python:

class Foo:
    value = 0

    def __init__(self, val):
        self.value = val
    
    def getValue(self):
        return self.value
    
    def doSomethingFancy(self):
        ... # Do something fancy!

As you can see, in Python, you only need one file, which defines both the interface and the implementation of the class. In C++, you need both the header (interface) and the .cpp file (implementation): in this way, you have to keep the two files continuously updated. Moreover, if you are looking for the implementation of a function, you cannot know if it has been inlined in the .hpp file or if it is in the .cpp file.

The problem is that this #include stuff is done by the preprocessor (the /usr/bin/cpp executable, a C heritage), and the C++ compiler is completely oblivious of any inclusion when it comes to parse the files (in principle). This leads to the well-known fact that if, after having compiled your program, you modify foo.hpp, recompiling all the files that depend on it, then the compiler will not be aware of this and strange things will happen. You must circumvent this by relying on other software (see the section of the GNU Make about automatic prerequisites).

Of course, this example is not completely fair as Python is interpreted while C++ is compiled. But consider that the guys that designed D (again, a compiled language) were able to devise a module mechanism that is similar to Python — at the expense of breaking compatibility with C/C++, of course. Here is file foo.d:

// Written in the D language

class Foo {
public:
    int value;
    this(int aValue) { value = aValue; }
    void doSomethingFancy()
    {
        // Do something fancy!
    }
}

To use this class in a program, put import foo; at the beginning of your source code. If the test program is testfoo.d, you can build the program using the command dmd -o testfoo testfoo.d foo.d. This is much like C++, where you would write cc -o testfoo testfoo.cpp foo.cpp, but in this case you do not need any header file at all! (Even if D allows for .di files that are similar to C++ header files, their usage is limited, as explained in this forum post: - also, .di files can be automatically generated by the compiler).

What is the situation with Ada? Its approach is somewhere in the middle between C++ and D: modules (in Ada terminology, packages) are split in two files, one (with extension .ads) providing the interface, the other one (with extension .adb) providing the implementation. But, unlike C++, Ada compilers are able to automatically decide which packages are up-to-date and which ones need to be rebuilt, without resorting to any GNU Make magic. (This feature was present in Borland Turbo Pascal's units as well: this is one of the reasons why I am still in love with Pascal.) Moreover, GNAT includes a handy tool, gnatstub, which automatically generates an .adb file from a .ads file (this is the contrary of what D does with .di files, but it fits the way programs are usually developed in Ada — first the interface, then the implementation.)

Multiple declaration of variables

In section 4.9.2 of the C++ programming language (third edition), Stroustrup warns the reader of a potentially confusing way of declaring variables. Consider this example (taken from Stroustrup's):

int* p, y;  // int* p; int y;

It can be potentially confusing, because it it not clear if y is a pointer to int like p or not. Stroustrup says that such constructs should be avoided. But then, one might ask, why are they accepted by the C++ standard? The answer is simple: to preserve compatibility with C.

On the other hand, both D and Ada forbid such constructs.

Implicit conversion between 0 and NULL

A particularly nasty problem with C++ is the use of NULL. Consider what happens if you want to initialize a string to "false", but forget the double quotes:

#include <iostream>
#include <string>

int main(void)
{
    std::string s = false; // What the programmer meant is "false";
    std::cout << s << std::endl;
    return 0;
}

(this is a real example, see this thread on StackOverflow.)

Compiling the program using g++ 4.4.6 does not produce any warning, yet the program crashes:

$ g++ -o stringtest stringtest.cpp
$ ./stringtest
terminate called after throwing an instance of 'std::logic_error'
  what():  basic_string::_S_construct NULL not valid
Aborted (core dumped)

(Note however that g++ 4.7.2 correctly produces the following warning: "warning: converting ‘false’ to pointer type for argument 1 of ‘std::basic_string<_CharT, _Traits, Alloc>::basic_string(constCharT*, const _Alloc&) [with _CharT = char; _Traits = std::char_traits; _Alloc = std::allocator]’ [-Wconversion-null]". Although it suffers from the usual GCC's verbosity, at least the compiler is now able to spot the problem.)

The problem is, the std::string type is not native in C++ but provided by a library. The only string-like type accepted by the raw C++ language is a null-terminated array of ASCII characters. Therefore, std::string is able to automatically convert char * into std::string. The problem is, false is equivalent to 0, which is in turns equivalent to NULL, i.e., a char * pointer. But std::string cannot be initialized to NULL, hence the std::logic_error exception.

Since D has native string types, this kind of errors is less likely to occur. Consider a straightforward translation of the C++ code above into D:

import std.stdio;

void main()
{
    string s = false;
    writeln(s);
}

Compiling it using DMD 2.060 will produce the following error message:

$ dmd stringtest.d
stringtest.d(5): Error: cannot implicitly convert expression (false) of type bool to string

Regarding Ada, this kind of error is impossible as the language avoids almost every implicit typecast. This can be annoying sometimes, but it makes me feel confident of what I'm writing. (It is often said that if your Ada program compiles without errors, then it is probably correct.)

2012-09-07

Discovering Ada

After almost one year since my last post, I am going to write something more regarding "exotic" languages. During the last year I was able to investigate the caracteristics of a well-known, underused language: Ada.

I was interested in it because I found two references to it in the realm of astrophysics: the first one is a spectral line synthesis code for magnetic stellar atmospheres, the second one is a list of satellites which are using software (partly) written in Ada. Also, I always heard of how Ada's typesistem is exceptionally safe and was truly interested in giving it a try.

So here is a list of features of Ada that caught my attention:

  • It is a statically typed language (like C/C++/Fortran/Haskell, unlike Python/Ruby/Scheme). However, unlike C and (to a lesser degree) C++, its type system is strong. This means that the compiler enforces type correctness and will not silently convert e.g. floats to integers.
  • Ada code looks verbose, but very readable. (As I understand, this was one of the driving requirements in the development of the language.)
  • It has a number of interesting features over C and C++, like keyword arguments (called named parameters), nested functions, and some primitive type-inference (e.g. you do not have to declare the type of a variable used in a for loop).
  • It allows code to be split into packages (more on this later).
  • Ada's versatile typesystem allows the programmer to make the compiler doing dimensionality checks. So, e.g., it will signal an inconsistency for code like if obj_speed < field_size, if obj_speed and field_size have been properly declared. Apparently, this feature was considerably extended with the latest version of the language (Ada 2012).
  • It has native support for multitasking. (As far as I know, Ada, Erlang and Go are the only non-academic languages that were designed from the ground up with this capability.)
  • Ada is compiled to machine code: the reference open-source implementation is GNAT, which is part of GCC (GNU compiler collection).
  • Being a tightly integrated component of GCC, it is extremely easy to develop bindings to C/C++ libraries (there is even a tool to do this automatically).
  • Interestingly enough, GNAT is developed by AdaCore, a commercial company which seems to have a quite large user base. It develops both the open-source compiler and a commercial version.

Is Ada outdated?

Before digging into Ada, I had the idea that it was an outdated language with virtually no users today. But I was wrong on both fronts:

  • Ada was born more or less in the same years as C++: work on Ada began in 1976, while Stroustrup's "C with classes" toy language dates back to 1979.
  • There are a lot of Ada users. Only, Ada programmers do not seem to work in the contexts I'm used to.

Is Ada verbose?

I do not like verbosity in general. I have always been amazed by the coinciseness of languages like Haskell: in my opinion, the shorter a program is, the quicker you're able to find problems in it. And, undoubtedly, Ada is quite verbose. A lot of grammar constructs are not strictly necessary for the compiler to understand what the code should do: e.g. the is at the end of procedure/function declaration.

However, I must admit that I've found more than once some source code that was so condensed that it is difficult to understand how it worked &edash; or why it was not working. Readability is probably as important as coinciseness. Ada's designers put a lot of effort in making the language easy to read, even to people that have never learned Ada.

Types, types and types

Ada's strongest advantage is probably its versatile typesystem. You can define "subtypes" which optimally limit the range of values of a primitive type (e.g. an integer which can hold values between 18 and 28): Ada will automatically add bounds checking code. (If you do not limit the range, your subtype works like typedefs in C.)

However, the most interesting feature is the ability to define new types from primitives. In this case you are not allowed to mix the new type with its primitive, unless you explicitly tell to compiler to allow you. So, you can e.g. create three new types distance, time, and speed from the primitive type float: you will not be allowed to combine (e.g. add/multiply) variables of different type, unless you manually override compiler's checks or redefine operators (like C++). This allows to check for measure unit's consistency in the code. (Such a feature is possible in C++, at the expense however of writing a lot of bolierplate code: basically, you have to define a class which wraps a float and implement manually every operation you need on them; on the contrary, Ada already knows how to sum two distance variables, as they behave the same as floats.)

Packages

I am not happy about how C/C++ programs are split into multiple files. You usually separate the classes and functions in different .cpp files, and in each of them you include a .h file which provide the class/function definition. However, the compiler is unaware of the difference between .h and .cpp files: the difference is relevant only to the C preprocessor.

Compare this with Ada packages. You have to write two files, as in C++: one with extension .ads (the specification file, analogous to .h files) and .adb (the implementation file, analogous to .cpp files). However, the Ada language allows you to specify what of the package has to be exported and what is meant to be private. This is similar to the concept of private/public methods in C++ classes, but it works at file level. It is much similar to units in Turbo Pascal, and it is much more effective. (Also, it allows the GNAT compiler to recompile outdated dependencies without the need of a Makefile.)

So, how fast is it?

I was particularly ingrigued by the fact that the GNAT Ada compiler is integrated into GCC. This allows Ada code to be optimized by the same machinery GCC employs for C/C++/Fortran/ObjC code, and it can in principle guarantee the same performance. I was however puzzled by the results of the Computer Language Benchmarks Game, which clearly showed that on average C++ code required less memory and in some cases much less time to run (here is the GNAT/g++ comparison, and here the GNAT/gcc comparison, which is even worse for GNAT).

So I investigated a bit why this difference. I picked the k-nucleotide example, for which the fastest C code is available at this link. I found that, even if Ada is considerably slower than C, it still ranks third. In my opinion, this indicates that the C program has been dramatically optimized, not that Ada is inefficient per se.

As a side note, I found that C/C++ programs in the Shootout are sometimes so optimized that it is difficult to understand what they are doing: look at this implementation of mandelbrot, expecially the function calcrow: do you now what __builtin_ia32_cmplepd and __builtin_ia32_movmskpd are supposed to do without reading the comments? Given the large number of C/C++ programmers, I bet that it's easier to find a relatively large subset which is good in low-level optimizations and assembly coding: this might explain why C/C++ Shootout programs often perform better than Fortran and Ada.

2011-10-14

Floating-point reference parameters in C++

Today I had some time to verify an idea I have always had: in C++ it is better to avoid unnecessary references when passing double arguments to functions.

The idea to verify this came while I was inspecting some C++ code. One of the functions had a parameter which was declared as double & despite the fact that such parameter was never changed in the function body. Consider this example:

The problem with the double & lies in the fact that the function receives a pointer to double instead of a double: therefore it has to look for the value pointed by the argument and copy it into a floating-point register:

Had we defined doubleThis with a double parameter (without the & indicating a reference) like this:

then GCC would have produced the following assembly code:

which is shorter and faster.

In short: avoid using unnecessary references. Apart from the fact that they obfuscate the meaning of the function (a reference parameter indicates that the function is going to alter its contents, while this is not true for doubleThis), depending on the type of the parameter they might also produce slower code.

2011-02-22

Experiments with Racket

In the last weeks I have played with Racket (formerly PLT Scheme), a powerful language derived from Scheme. Here I describe a couple of problems I solved using Racket, in order to underline the difference between this language and Python.

First, let me say that I am really impressed by the quality of the implementation of Racket. It is fast (apart from the execution times for the GUI), it has a huge set of libraries, it is easy to use and it has a number of nice facilities.

Project Euler again!

The first script I wrote was to solve one of the easiest problems proposed in the Project Euler site, that is problem 15. This kind of problem can be solved using a divide-and-conquer algorithm, so here comes my first attempt:

(define (num-of-solutions width height)
  (+ (if (> width 1)
         (num-of-solutions (sub1 width) height)
         1)
     (if (> height 1)
         (num-of-solutions width (sub1 height))
         1)))

Such solution relies on the fact that at any point in the grid one can only move left or down (provided that there is room to move in that direction), and that once you make a move you are trying to solve the same problem for a rectangular grid with a smaller size.

This solution however has a problem: when running num-of-solutions on a 20x20 grid, the function takes forever to evaluate. The problem lies in the fact that each non-trivial call to num-of-solutions spawns two recursive calls to itself, so that the number of calls grows quickly.

There is a simple solution: since many of the calls share the same parameters (and hence return the same number), we can use a technique called memoization: when calling num-of-solutions with some parameters for width and height, save the return value in a table so that subsequent calls to the function with the same parameter will simply look up the table instead of re-running the function. (See SICP for an example applied to the calculation of Fibonacci's numbers, a problem stunningly similar to the one we are solving here.)

In order to memoize num-of-solution I might have relied on standard techniques available for the Scheme language, but I instead had a look in the excellent Racket documentation and in PLaneT (the collection of separate packages available for Racket). In a few seconds I found the memoize package, which does exactly what I wanted: just changing define into define/memo will make num-of-solution orders of magnitude faster:

(require (planet dherman/memoize:3:1))

(define/memo (num-of-solutions width height)
  (+ (if (> width 1)
         (num-of-solutions (sub1 width) height)
         1)
     (if (> height 1)
         (num-of-solutions width (sub1 height))
         1)))

(num-of-solutions 20 20)

And here comes a nice touch. I did not know how to install this additional package in Racket, so I wrote the example above in the IDE and compiled it, hoping that any error message would help me to understand what to do. Guess what? The IDE automatically connected to PLanetT, downloaded and installed the package and continued the execution: in a few seconds I had the answer! (Compare this with Python: it is like if any time you import some library not available the interpreter downloads and installs it automatically. How nice!)

A more complex example

The second program I wrote was considerably longer. I had a problem with a TWiki site I am administering: every user would attach huge PDF files to wiki pages instead of properly using a separate site (I shall not bother you with details about how this site works: let just assume it is a FTP site). This caused any backup of the TWiki site to be quite huge, so I decided to fix this by manually moving the biggest PDF files to the FTP site and then change each link to the new place. As there are potentially thousands of links to fix, this clearly required an automated solution. And here comes Racket.

My plan was to download each page containing huge attachments in a text file, search for the TWiki markup signalling a link to any file already moved to the FTP site and modify the link properly. The caveats for such task are:

  1. I am not going to move all the files, just the biggest ones. So the program should be smart enough to decide which links to change and which not.
  2. So far I have considered the remote site as a FTP site, but this is not really the case. Specifically, to get a file from there the URL must contain the file name and a unique integer ID. So links to this site cannot be created automatically, they must be specified one by one. An example might be http:/foo/bar/filename.pdf?node=1234. Thankfully the file name is always present!

The program I created relies heavily on Racket's regular expressions, an addition of Racket over the Scheme language (note that regexps are native in Racket – like Ruby but unlike Python, which requires you to use import re at the beginning of your script). It takes two input files: the first one contains the wiki page, the second one contains a list of links to the remote site. The code extracts bare file names from the links and produces an associative list (similar to dictionaries in Python), which then uses to fix links in the wiki page.

The most difficult thing for me was to think how to implement iterations using recursion instead of for loops (and such code has a lot of iterations: looping over links, looping over markups…): however, I am willing to master this technique as the use of recursion is common in functional languages.

So far the code has worked like a charm. I have not benchmarked it, but it works so fast that execution times are negligible (of course I created a compiled executable in order not to start Racket every time I need to use it).

2010-10-19

Functional programming

Recently I had to work with an old numerical software written in Fortran77. Such program simulates the microwave radiation pattern of an emitter (like an horn or a waveguide). The program requires several minutes to compute a radiation pattern for all but the simplest configurations, so I started wondering how the computation might be accelerated (despite the fact that, being the software closed-source, the only way to go would be to rewrite it from scratch).

One way to get a significative speedup is to take advantage of multi-core architectures, as this seems so far the only way to ensure that performances will scale in the future. As it is well known, parallelizing an algorithm is not generally a trivial task. Rather, it often involves rethinking the problem from the ground up.

As parallelizing C/C++/Fortran code requires using either OpenMP (easy, but not feasible for complex tasks) or MPI (many features, but difficult to use), I started investigating some "new" approaches to parallelization. Then I found… functional programming!

Functional programming (opposed to imperative programming) is a way to write programs that concentrates on functions which work on immutable values. The point of having immutable values helps parallelization, as one does not need locks, mutexes and semaphores: this explains why, even if functional programming is a quite old concept, it has gained a renewed interest in the last few years. Microsoft is pushing its F# language (based on the very nice family of Caml languages), and other languages have been designed with multicore CPUs in mind. E.g. this post compares Clojure (see below) with Python with an explicit note on concurrency/parallelism (something Python is not very good at, see also the follow-up).

In the last weeks I started gathering some information about the most widely used functional languages. I started with OCaml, as it seems one of the oldest and widely-used languages. Although the INRIA compiler produces really fast code, it seems that some fundamental problems with the garbage collector prevent OCaml programs from taking benefit of multicore CPUs. Although there are attempts to solve this, OCaml does not seem ready for multicore yet. So I looked for another language to study. Haskell enjoys a very large community, but I do not really like it (but it seems to support concurrency pretty well…), and I avoided Erlang too (no specific reason, maybe it is its syntax that scares me).

Among the most interesting functional languages of the new generation, Scala and Clojure have caught my attention. Both run using the Java Virtual Machine (which, according to your taste, might or not be an advantage) and have a similar objective (exploiting functional programming for easy parallelization, see e.g. this page about Clojure), although they try to accomplish this using two different approaches: Scala is clearly derived from Java and ML, while Clojure is a simpler (and nicer) LISP. Both are moving targets (both languages seem to change pretty fast with each release) but already have a remarkable community.

I am studying Clojure with great interest, as it reminds me of my old days with Scheme. Although it seems that Scala often beats Clojure in performance (see The Computer Language Benchmarks Game, but take their results with a grain of salt), I find the syntax of the latter easier to understand. More details to come with the next posts!