|
Post by sdlbasic on Sept 27, 2016 7:59:32 GMT -6
Hey n00b, you could have a thread about rc_basic development, what do you think? I say this because I love this kind of subject and I have been reading the rc_basic code too.
I was reading the code and found this in the first line of the file:
#define RC_WINDOWS
You could use the compiler's built-in macros instead, like
#ifdef _WIN64
    // define something for Windows (64-bit)
#elif defined(_WIN32)
    // define something for Windows (32-bit)
#elif defined(__linux__)
    // Linux
#elif defined(__unix__)
    // all Unix systems not caught above
#elif defined(_POSIX_VERSION)
    // POSIX (this macro comes from <unistd.h>)
#endif
Example in one of your functions:
inline string rc_intern_OS()
{
#if defined(_WIN32) || defined(_WIN64)
    return "WINDOWS";
#else
    return "UNKNOWN"; // placeholder so other platforms still return a value
#endif
}
And an extra one to get the version on Windows systems:
string rc_intern_OS_VERSION()
{
#if defined(_WIN32) || defined(_WIN64)
    // Needs <windows.h>. GetVersionEx is deprecated since Windows 8.1.
    OSVERSIONINFO osvi;
    ZeroMemory(&osvi, sizeof(OSVERSIONINFO));
    osvi.dwOSVersionInfoSize = sizeof(OSVERSIONINFO);
    GetVersionEx(&osvi);
    return to_string(osvi.dwMajorVersion) + '.' + to_string(osvi.dwMinorVersion) + '.' + to_string(osvi.dwBuildNumber);
#else
    return ""; // placeholder for other platforms
#endif
}
If you do not want this here, or this is not a good place for it, just delete it :-).
|
|
|
Post by n00b on Sept 27, 2016 13:22:26 GMT -6
I think that this is a good idea. I am making this thread a sticky. I have actually been in the planning stages for rcbasic 3.0 for a while now. As far as the defines go I did it the way I did because I just did not know the actual compiler defines at the time. I have been planning on rewriting a majority of the compiler and improving the runtime as much as possible for the 3.0 release. Here are some plans I have for it right now.
Plans for 3.0
*Rewrite compiler for improved compiler speed and stability
*Reducing the size of the opcodes used by the runtime
*More improvements to the graphics library
*Developing a code compatible interpreter to make it easier to test large programs
Here is a quick example of what I mean by reducing the size of opcodes. Let's say you had the following line.
5 + 3
The compiler would take this line and produce the following vm assembly code.
mov m0 5
mov m1 3
add m0 m1
In the above code 5 is stored in m0 and 3 is stored in m1. Then the last line adds m0 and m1 and stores the result in m0. It's pretty efficient and gets the job done. But the problem is what the above code compiles to.
To demonstrate the problem I will just go over what the first line compiles to.
This is the first line from the previous example.
mov m0 5
And this is what it compiles to.
32 1 0 0 5
In the above line the number 32 is the mov instruction. The 1 is saying you want to move a number into one of the number registers (i.e. m0, m1, m2, etc.). The first 0 is the number of the register, which in this case is m0. The second 0 is saying that you are moving a raw number into the register. And the 5 is the raw number that it is moving into the register.
I didn't realize how much of a problem my implementation of this was until I started working on the runtime. I ended up having to check the flag that follows the register number: if it was 0 then treat the number as a raw number, if it was 1 then treat the number as an index into rc_vm_m[], which is an array in the runtime that holds the values of the m registers, or treat the number as a string, etc.
That is a lot of if statements to go through for each instruction. That is why I am rewriting it so that there will be a different opcode for each of these scenarios, which should make the runtime much faster.
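As a rough illustration of the difference, here is a minimal C++ sketch. Only the generic encoding 32 1 0 0 5 comes from the description above; the opcode numbers 100 and 101, the names OP_MOV_NUM_RAW and OP_MOV_NUM_REG, the VM struct and the three-value layout of the specialised instructions are assumptions made up for this example, not the real rcbasic encoding.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical opcode numbers: only 32 (generic mov) comes from the thread.
enum Op : int {
    OP_MOV         = 32,   // generic: mov <dst kind> <dst> <src kind> <src>
    OP_MOV_NUM_RAW = 100,  // specialised: m[dst] = raw number
    OP_MOV_NUM_REG = 101   // specialised: m[dst] = m[src]
};

struct VM {
    double m[8] = {0};     // number registers m0..m7
};

// Generic form: every mov inspects two operand-kind fields at run time.
// Returns the index of the next instruction.
std::size_t step_generic(VM& vm, const std::vector<int>& code, std::size_t pc) {
    if (code[pc] == OP_MOV) {
        int dst_kind = code[pc + 1], dst = code[pc + 2];
        int src_kind = code[pc + 3], src = code[pc + 4];
        if (dst_kind == 1) {                                // number register
            if (src_kind == 0) vm.m[dst] = src;             // raw number
            else if (src_kind == 1) vm.m[dst] = vm.m[src];  // another register
        }
        return pc + 5;
    }
    return pc + 1;  // unknown opcode: skip it
}

// Specialised form: the opcode itself says what the operands are, so one
// switch replaces the nested if statements and each instruction is shorter.
std::size_t step_specialised(VM& vm, const std::vector<int>& code, std::size_t pc) {
    switch (code[pc]) {
        case OP_MOV_NUM_RAW: vm.m[code[pc + 1]] = code[pc + 2];       return pc + 3;
        case OP_MOV_NUM_REG: vm.m[code[pc + 1]] = vm.m[code[pc + 2]]; return pc + 3;
        default:             return pc + 1;  // unknown opcode: skip it
    }
}
```

With this layout "mov m0 5" shrinks from five values (32 1 0 0 5) to three (100 0 5), and the runtime no longer tests the operand kinds for every instruction.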
The idea of building the interpreter is more inspiration from sdlBasic. I like how sdlBasic can execute code on the fly without the ridiculous compile times that rcbasic currently suffers from. My game engine in its current state takes almost 10 minutes to compile. I feel an interpreter would make it more appealing for programmers who like to test their code every time they make a change.
Lastly, I am thinking about abandoning the editor completely. The editor is a project in itself. And I feel that I would be better off providing support for another editor through plugins rather than trying to continue with building my own.
|
|
|
Post by n00b on Sept 27, 2016 13:30:44 GMT -6
Here is the rcbasic internal design doc. The list of codes for the built-in functions is mostly incomplete, but the internal structure part is there and it shows how the vm works.
Attachment: VM.ods (22.96 KB)
|
|
|
Post by sdlbasic on Sept 27, 2016 14:48:24 GMT -6
Why not cut out the middle man :-)? To add 2 in asm, for example, you mov 3 into reg0, then you add the new value directly to reg0, and then you move the result to another reg or var, like:
MOV @0 3
ADD @0 2
MOV result @0
This way you cut one operation.
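To make the saving concrete, here is a small C++ sketch of a code generator that can emit either form for an addition of two literals. The function names are made up for this example, and the "add m0 3" form assumes an add instruction that accepts a raw number as its second operand, which nothing in this thread confirms for the rcbasic VM.

```cpp
#include <string>
#include <vector>

// Three-instruction form described earlier in the thread:
// load both literals into registers, then add the registers.
std::vector<std::string> emit_add_via_registers(int lhs, int rhs) {
    return {"mov m0 " + std::to_string(lhs),
            "mov m1 " + std::to_string(rhs),
            "add m0 m1"};
}

// Shorter form: assumes a hypothetical `add <register> <raw number>`
// instruction, so the second mov is not needed.
std::vector<std::string> emit_add_immediate(int lhs, int rhs) {
    return {"mov m0 " + std::to_string(lhs),
            "add m0 " + std::to_string(rhs)};
}
```

For 5 + 3 the first function produces three lines of vm assembly and the second produces two.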
Where is the vm code? Losing the IDE is not a bad idea; third parties could build one for the compiler. Maybe you should provide one IDE preconfigured to work with rc basic, like sdlBasic does with SciTE, or provide a way to configure one.
In the code I also saw that you use inline on some functions. This is meant to speed up the more important code, but whether it has any effect depends on the compiler, and it may not work when the functions are big.
|
|
|
Post by n00b on Sept 27, 2016 15:50:36 GMT -6
I originally tried to have the final code more optimized but wasn't able to get it to work the way I wanted for some reason. But since I am going back and rewriting almost all the code for the compiler it will give me a chance to optimize the code emitted from the compiler.
I also didn't know too much about how GCC determines if a function will be inline or not when I started. I actually started working on rcbasic back in early 2014 and before that I had not done any programming since the old sdlBasic forum that was moderated by kveroneau was still up (I might have misspelled his name). I also had to learn SDL2 which took some time. I think I now know enough to seriously improve the compiler and runtime.
I should be able to finish an alpha version of 3.0 by the end of the year or possibly early next year. I am not really changing anything that is currently in rcbasic but just improving the speed, efficiency, and stability of the compiler and runtime. I also want to build an interpreter that uses the parser of the compiler but executes code on its own to make testing a lot faster. So in my design right now I am planning on building a library specifically for parsing and generating tokens that can be used for both the compiler and interpreter. I will upload a more in-depth design document as I get a better idea of how all this will work.
Another thing I want to improve in 3.0 is the code itself. I now realize how unreadable of a mess rcbasic's code is right now. I am definitely going to improve that by commenting more, organizing code better, and having less code in each function.
I am still not sure if I should implement runtime error checking for stuff like array indexes out of bounds. I think sdlBasic does this but I originally chose not to do it because I was trying to achieve the fastest execution time possible. But I don't think that extra if statement would slow it down too much so maybe it would be a good idea.
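For reference, the extra if statement would look roughly like the C++ sketch below. RcArray and both function names are hypothetical and only stand in for whatever storage the runtime really uses; the point is that the checked version adds a single comparison per access.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical runtime array type; rcbasic's real array storage may differ.
struct RcArray {
    std::vector<double> data;
};

// Unchecked read: no test at all, but an invalid index is undefined behaviour.
inline double array_get_unchecked(const RcArray& a, std::size_t index) {
    return a.data[index];
}

// Checked read: one comparison per access, then a runtime error with a message.
inline double array_get_checked(const RcArray& a, std::size_t index) {
    if (index >= a.data.size()) {
        throw std::out_of_range("array index " + std::to_string(index) +
                                " out of bounds (size " +
                                std::to_string(a.data.size()) + ")");
    }
    return a.data[index];
}
```

A possible middle ground is to keep both paths and pick one with a compiler switch, so programs can be tested with checks enabled and released without them.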
|
|
|
Post by sdlbasic on Sept 27, 2016 16:07:49 GMT -6
That sounds good. I have a small calc interpreter that I built in C. It gets its input from stdin, then lexes and parses the code. I chose to split it into separate files with header files; this way it is a lot easier to understand, and each file has its own functions. If you like I could send it to you so you can see how it works, nothing fancy though. Do you need any help with improvements, suggestions or whatever on the subject?
Thanks, this is the kind of challenge I like.
|
|
|
Post by n00b on Sept 28, 2016 1:05:44 GMT -6
If you don't mind sharing your interpreter code it would help me learn a lot about how to efficiently execute code. I wrote an interpreter for a custom BASIC-like language in Java back when I was in high school but I have not tried to write an actual interpreter since then. I want to learn as much as possible while developing 3.0. As for the editor, I think I am going to use geany and include a syntax file and scripts to run rcbasic code with it. I am also going to improve the docs more as they are still bare bones. If you could help me get the interpreter running it would really help in comparing interpreted output to compiler output. If you could help just let me know. I am also thinking of changing the name to something easier to find in search results. I am not sure what just yet.
|
|
|
Post by sdlbasic on Sept 28, 2016 3:52:14 GMT -6
Attachment: simple_calc.zip (3.1 KB)
Above is the code that I talked about. It's a simple calculator and it's not finished, but it is organized in a way that lets you see the tokens and the parser. I'm glad to help out, let me know what to do. Thanks.
|
|
|
Post by sdlbasic on Sept 28, 2016 4:33:49 GMT -6
If you do not mind, I will make some changes to some of the functions you have implemented to try to make them more readable and better. If you do not want them here, please delete them and I will stop. This function converts strings to uppercase. This is your implementation; in this case you are evaluating the length of the string on every pass through the for loop. You can improve this by adding a size_t len = strToConvert.length(); before the loop, so it is only evaluated once, outside the loop. I made a different one.
string StringToUpper(string strToConvert)
{
    // change each element of the string to upper case
    for(unsigned int i=0;i<strToConvert.length();i++)
    {
        strToConvert[i] = toupper(strToConvert[i]);
    }
    return strToConvert; // return the converted string
}
This is my implementation. In the while loop I do the assignment and keep going while the result is different from '\0' (the end of the string), so the body only increments i; then I return the uppercased string. The same could be done for lower case.
string StringToUpper(string strToConvert)
{
    size_t i = 0;
    // relies on strToConvert[length()] being '\0', which is guaranteed since C++11
    while((strToConvert[i] = toupper(strToConvert[i])) != '\0')
        i++;
    return strToConvert;
}
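For comparison, here is a third variant as a hedged sketch, using std::transform from the standard library. The name StringToUpperAlt is made up so it does not clash with the versions above. Two things are worth knowing when choosing between them: std::string::length() is a constant-time call since C++11, so calling it in the loop condition usually costs very little, and the while-loop version stops at the first '\0' character, so a string with an embedded null would only be partly converted.

```cpp
#include <algorithm>
#include <cctype>
#include <string>

// Hypothetical name, chosen to avoid clashing with StringToUpper above.
std::string StringToUpperAlt(std::string s) {
    // Taking the character as unsigned char avoids undefined behaviour in
    // std::toupper for negative char values (e.g. bytes above 127).
    std::transform(s.begin(), s.end(), s.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return s;
}
```

This walks the whole string by iterator, so it does not depend on the terminating null and also handles embedded '\0' characters.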
|
|
|
Post by eyfenna on Sept 28, 2016 9:33:40 GMT -6
I'm looking forward to seeing rcbasic 3.0
(omg assembler code ... brings back memories)
|
|
|
Post by n00b on Sept 28, 2016 14:50:04 GMT -6
Thanks for looking at this. Anything with less code is always better in my opinion so I will definitely be able to use this. I am getting ready to look into the code for your interpreter now. I am going to start a new repository for this since it will be mostly new code for the compiler along with the interpreter. After I get it up I will post a link to it here.
|
|
|
Post by sdlbasic on Sept 28, 2016 15:10:12 GMT -6
No problem, just leave the link here and I will see how I can help out. This is part of what we could do to optimize the code; the code that I left here is nothing compared to rc basic, it is just a starting point. Let me know what you need me to do, because our time zones are very different; it is 22:10 here now :-).
|
|
|
Post by n00b on Sept 28, 2016 16:05:27 GMT -6
Here is the link to the repository. Right now there is only a readme file. I also sent you a collaborator request on github.
RCBASIC3
I think a good place to start is with the parser and tokenizer library. This is how the RCBASIC tokenizer works right now.
Generates these tokens:
*Numbers
*Strings
*Operators - This will also include special cases to test for, like when more than 1 operator is used as a single operator (ex. >=, <>, <=, etc.)
*Keywords - Needs to check if a line item is a keyword
*Anything else can be considered an identifier that can be checked against functions and variables
*NOTE: Need to implement built-in HEX and BINARY recognition
*NOTE: If possible, built-in constants like keyboard codes and boolean values should be evaluated in this step
The parser generates vm assembler based on the tokenized expression. I think it would be a good idea to keep these as two separate steps rather than a single step. If you think it would be more efficient to do these simultaneously let me know.
Also something I want to have set up from the beginning this time is a test environment. I actually did not have any real way to test the output last time other than manually feeding expressions to the tokenizer and then reading each token and opcode. That was the dumbest thing I ever did.
Unless there is a better way of writing tokens, I would like to keep those the same also. Here is what tokens look like in rcbasic right now.
<num> number
<string> text
<id> identifier
<par> </par>
<curly> </curly>
<square> </square>
<add> <sub> <mul> <div>
<equal> <greater_equal> <less_equal> <not_equal> <greater> <less>
<mod> <and> <or> <xor> <not>
<keyword> --- each keyword was its own token
After the tokenizer step is done, the compiler and the interpreter will probably not share any more code to perform any other task. I know it probably seems pointless to rewrite the tokenizer when it works just fine but I think that a lot of things could be done better.
If you could start on a function that just tokenizes a math expression with these tokens, I will start writing a new parser to read the tokens and generate opcodes. Then we can set up some kind of test environment to test complex expressions and try to crash it.
The parser is currently one of the biggest flaws in rcbasic. It deals with parentheses by evaluating the last parenthesis it encounters and continues doing this until the line is done. It will give you the correct result for a math expression but it can cause some weird output on a line that is using functions for input or output in its expressions. Try the following code to see what I mean. It evaluates the second parenthesis first, then it determines that it is the argument for a function and executes the function. Then it does the first one. c$ is going to be equal to the correct value in the end but it does it out of order.
c$ = Input$("Text 1:") + " " + Input$("Text 2:")
Print "c$ = " + c$
I will also start a new opcode sheet and document all the opcodes.
|
|
|
Post by sdlbasic on Sept 29, 2016 5:56:39 GMT -6
n00b said: "If you could start on a function that just tokenizes a math expression with these tokens..."
Is this going to be in C++ or C? Do you have a template that you want me to follow, or should I just base it on the code I uploaded for the calculator?
|
|
|
Post by sdlbasic on Sept 29, 2016 9:33:45 GMT -6
I started to work on some functions to help out. This is what I have so far; it can be ported to C++ easily, and maybe that is not even needed.
/*
<num> number
<string> text
<id> identifier
<par> </par>
<curly> </curly>
<square> </square>
<add> <sub> <mul> <div>
<equal> <greater_equal> <less_equal> <not_equal> <greater> <less>
<mod> <and> <or> <xor> <not>
*/
/* some helper functions */
int isAlpha(char c)
{
    return ((c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z'));
}
int isDigit(char c)
{
    return (c >= '0' && c <= '9');
}
int isAddOp(char c)
{
    return (c == '+' || c == '-');
}
int isMultiOp(char c)
{
    /* no semicolon after the condition, otherwise the else does not compile */
    if (c == '*' || c == '/')
        return 1;
    else
        return 0;
}
int isWhite(char c)
{
    if (c == ' ' || c == '\t')
        return 1;
    else
        return 0;
}
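Helpers like these could be combined into a first tokenizer along the lines of the C++ sketch below. It is only an illustration under several assumptions: the function name tokenize is made up, a token is stored as a tag followed directly by its text (for example <num>5), which the thread does not specify, and only numbers, identifiers, + - * / and parentheses are handled. Strings, comparison operators and keywords are left out.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Turns a simple math expression into rcbasic-style token strings.
// Sketch only: the "<tag>text" storage format is an assumption.
std::vector<std::string> tokenize(const std::string& src) {
    std::vector<std::string> tokens;
    std::size_t i = 0;
    while (i < src.size()) {
        unsigned char c = static_cast<unsigned char>(src[i]);
        if (c == ' ' || c == '\t') {           // skip white space
            ++i;
        } else if (std::isdigit(c)) {          // number: digits and '.'
            std::size_t start = i;
            while (i < src.size() &&
                   (std::isdigit(static_cast<unsigned char>(src[i])) || src[i] == '.'))
                ++i;
            tokens.push_back("<num>" + src.substr(start, i - start));
        } else if (std::isalpha(c)) {          // identifier: letter, then letters/digits
            std::size_t start = i;
            while (i < src.size() && std::isalnum(static_cast<unsigned char>(src[i])))
                ++i;
            tokens.push_back("<id>" + src.substr(start, i - start));
        } else {                               // single-character tokens
            switch (c) {
                case '+': tokens.push_back("<add>");  break;
                case '-': tokens.push_back("<sub>");  break;
                case '*': tokens.push_back("<mul>");  break;
                case '/': tokens.push_back("<div>");  break;
                case '(': tokens.push_back("<par>");  break;
                case ')': tokens.push_back("</par>"); break;
                default:  tokens.push_back("<unknown>" + std::string(1, src[i])); break;
            }
            ++i;
        }
    }
    return tokens;
}
```

Because the function returns plain strings, expected token lists can be compared directly in a test program, which fits the test environment idea mentioned earlier in the thread.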
|
|