-
Notifications
You must be signed in to change notification settings - Fork 4
Initial implementation of the Monkey language #1
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Leaving some open questions here.
// a HashLiteral isn't a valid expression as a key in a monkey hash. | ||
impl Hash for HashLiteral { | ||
fn hash<H: Hasher>(&self, _state: &mut H) { | ||
panic!("hash not implemented for HashLiteral"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this bad form to panic on a method implemented for a trait? This should never get called.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If executing this code should be a bug (the programmer's fault) it's usually recommended to use unreachable!()
to signal that this code should never be executed.
(In many cases, this also allows the compiler to do more optimizations, also when using zero-sized types like Never
which can never be instantiated.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
pub struct IfExpression { | ||
pub condition: Expression, | ||
pub consequence: BlockStatement, | ||
pub alternative: Option<BlockStatement>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had some confusion about some spots where it seems like you can use Option in lieu of using a Box. Is that true, what's going on here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can use Option
for any type that has a fixed size (that is Sized
), e.g. structs, enums. Box
is usually used for trait objects (as in Box<dyn Trait>
when the trait is "object-safe" (not all traits can be turned into trait objects)) or sometimes boxed slices (Box<[T]>
because while a reference to a slice is Sized (it's a fat pointer that also stores the len), a slice itself is not Sized).
Btw, a reference to a trait object is also a fat pointer (pointer to the instance and a pointer to the vtable of its trait impl).
The doc has a detailed explanation of the rules for when a trait is object-safe (sorry, I'm on my phone right now).
Box of course also differs from Option in that it represents a definitely existing T instead of an optional T, and thus automatically dereferences to its content (via the Deref
trait).
Btw, I just read your blog post and I'm interested in helping you on your journey of learning Rust and using it in production :)
You mentioned that you're also interested in gRPC and web servers. There are several crates for gRPC (when I tried it there was only one but it was working well) and for Web servers I've been using rocket in production (before actix-web existed) and now also actix-web (because it's faster / async (and it allows serving websockets on the same port)).
There are also some other async web server frameworks like tower-web but they are in early stages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another +1 on actix/actix-web, I'm surprised at just how solid it is and it seems to be taking over in any place I see such programming styles being used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@Boscop thanks, that's great info. I'll have to read more up on trait objects. Didn't really get a chance to utilize traits in this code, but it's coming up for some future hacking. I'll search through the gRPC crates. I saw in some thread criticism about Rocket because it's not async so I'll probably dig into actix first.
input: &'a str, | ||
expected: i64, | ||
} | ||
let tests = vec![ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is the table test idiom used by Rustaceans or is this frowned upon?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah it is common pattern used, you can even look into quickcheck & proptest which are really popular libraries that helps doing those table tests in some "more powerful" ways.
Builtin(Builtin), | ||
Array(Rc<Array>), | ||
Hash(Rc<MonkeyHash>), | ||
Null, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Null object is referenced everywhere and I always create a new one when used. In the book a constant was used. I couldn't quite figure out how to do this in basic Rust. Would you use something like the lazy_static
crate to define constant objects?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rust represents enum variants of unit type (like Null) very efficiently, so creating them over and over is fine.
In this example, creating the unit variant compiles down to a single assembly instruction (mov [r14], 2
). While the exact memory layout of an enum is an implementation detail, I don't have any reason to believe that it would ever get much more expensive than that.
} | ||
|
||
impl Object { | ||
pub fn inspect(&self) -> String { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The book used the inspect method, but I think if I were more idiomatic I would just implement the Display
trait.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Display
is useful for something displayable in a user-formatted way. If you want a low level detail representation then look at the Debug
trait, that seems closer to what inspect
is from what I see about what inspect
is supposed to be?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think that's right. I would imagine in a more complete implementation of a language I would want to implement both (not as derived, but specific to those objects)
impl Builtin { | ||
pub fn lookup(name: &str) -> Option<Object> { | ||
match name { | ||
"len" => Some(Object::Builtin(Builtin::Len)), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here as with Object::Null
, would be nice to re-use a constant of some sort. How to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You only need lazy_static
if the value can't be computed at compile-time.
Btw, the range of expressions that can be computed at compile-time is being expanded (through const fn
, similar to const-expr in C++).
For this case, you could just define a non-lazy (normal) static or const:
const BUILTIN_LEN = Object::Builtin(Builtin::Len);
The difference between a const and a static is that consts don't have a memory location, they are inlined at every use location, so it's not recommended to use const for large arrays, they will be copied onto the stack at the use locations and can blow the stack.
pub struct Function { | ||
pub parameters: Vec<ast::IdentifierExpression>, | ||
pub body: ast::BlockStatement, | ||
pub env: Rc<RefCell<Environment>>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is the spot where I end up creating a memory leak since there are circular references. I'll log an issue to talk about what to do about this since it's a longer thing. One simple question though: I've seen Rc<RefCell<...
and I've seen RefCell<Rc<...
. Is there a difference between the two? Which is "more correct"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For avoiding memory leaks caused by cycles in the interpreted code, this gc library seems like a good fit (but it's in an early stage):
https://www.reddit.com/r/rust/comments/9ozaut/shifgrethor_i_garbage_collection_as_a_rust_library/
RefCell is like Cell (allows mutation) except it works for types that aren't Copy
. Rc wraps it in a reference counted "box", so if you want to be able to mutate the same object from different (non-concurrent) code paths (in the same thread), you'd usually use Rc<RefCell<T>>
, e.g. with the job_scheduler crate if you want to be able to mutate an object from different jobs (that are not run in parallel).
It's used when the borrow checker won't let you do it without this, because it can't prove that the mutable borrows don't overlap in time, but you know it for sure (RefCell will panic at runtime if a second mutable borrow happens while the first one is still active).
This pattern is basically runtime borrow checking. It should only be used when compile-time borrow checking doesn't suffice (because there is the (small) ref counting overhead and it's better to prove the absence of a bug at compile-time than to check for it at runtime).
This pattern is also used heavily in the nphysics engine (the last time I checked) because it has to store a lot of cross references that allow mutating the objects.
AFAIR, I've never used the second way (RefCell<Rc<...
) (since I started using Rust in 2014), where did you see that?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought I had seen it the second way somewhere, but now I can't seem to dig up where I saw this. So maybe just misremembering all the different things I read on these.
Ok(left_exp) | ||
} | ||
|
||
fn prefix_fn(&mut self) -> Option<PrefixFn> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The way I did this felt kind of odd. He had in the book a global map that had a string key to match the token to the associated function used to parse. I got the same effect with this match and passing functions, but it felt a little weird. Is there a better way to do this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is basically the more type-safe version of that. Instead of looking up the function by the token's string repr, you're looking it up by the token variant. I don't think this is weird but I'd probably manually inline infix_fn
at its call location to make it more apparent / readable what its purpose is (so that the person reading this code doesn't have to jump around), also because the infix_fn
is short and only used at this one location.
So I'd write let infix_fn = match { ...
and then match infix_fn { ...
.
|
||
type ParseError = String; | ||
type ParseErrors = Vec<ParseError>; | ||
pub type ParseResult<T> = Result<T, ParseError>; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error handling examples in both The Rust Programming Language and Programming Rust used just Return
as a name. But when I tried to do that I got some weird error about cycles. Is that not allowed anymore and is it now the convention to do <Thing>Result
and <Thing>Error
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I assume you mean that you tried pub type Result<T> = Result<T, ParseError>;
? This doesn't work because the compiler thinks you are defining the type alias in terms of itself, hence the cycle detected
error (probably not the best error message).
Try using the fully qualified name instead:
pub type Result<T> = ::std::result::Result<T, ParseError>;
I'm opening this so I can leave questions on some lines for specific things. Hopefully I can come back to these and answer them!