java - Search for specific node in Abstract Syntax Tree


I am trying to find specific nodes in an AST (abstract syntax tree). The basic idea is:

  • An AST has been parsed from the source code and has approximately 10000 nodes.
  • There is a list of 50 items that I would like to find in the AST.

    Question: What is the best way to search for those 50 items in the AST?

    Right now, I'm thinking of putting those 50 items in an ArrayList, then visiting each node of the AST and comparing it against the ArrayList with a for-each loop. Is this a good idea in terms of performance? I want to make the operation faster; is there any other way to solve this problem?

    Answer: I wouldn't use an ArrayList, because you would have to scan it at every node, and that is just overhead. You can easily write the 50 predicates as "P1 or P2 or ...".
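As a minimal sketch of what "P1 or P2 or ..." could look like in Java: the `Node` class and the three predicates below are hypothetical stand-ins (the question does not say what the 50 items are), and a real parser's node type will differ.

```java
import java.util.Arrays;
import java.util.List;

final class DisjunctionSketch {
    // Hypothetical minimal AST node; not taken from any particular parser.
    static final class Node {
        final String kind;
        final String text;
        final List<Node> children;

        Node(String kind, String text, Node... children) {
            this.kind = kind;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Three assumed stand-ins for the 50 predicates P1..P50.
    static boolean isMethodCall(Node n)  { return n.kind.equals("MethodCall"); }
    static boolean isNullLiteral(Node n) { return n.kind.equals("Literal") && n.text.equals("null"); }
    static boolean isGoto(Node n)        { return n.kind.equals("Goto"); }

    // "P1 or P2 or ...": || short-circuits, so evaluation stops at the first match.
    static boolean isInteresting(Node n) {
        return isMethodCall(n) || isNullLiteral(n) || isGoto(n);
    }
}
```

Because the disjunction is ordinary code rather than a list that is iterated, there is no per-node loop or iterator overhead beyond the predicate calls themselves.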

    You can either search the tree once, checking the disjunction at each node to determine whether it is an interesting node, or search the tree 50 times, applying one predicate on each separate pass. In both cases you have to run the predicates, so they do not change the cost either way (caveat below).

    If you search once, you need to "or" the predicate answers together at each node, so the extra cost is 49 * [cost of OR] per node visited. If you search 50 times, the extra cost is 49 * [cost to visit tree node] * [number of nodes]. So the question is whether an "or" costs less than a tree node visit. "Or" is very fast on most machines, because it uses only registers and values already in the cache. Visiting a tree node can be fairly fast, but it likely takes many more instructions; worse, it touches memory. If your tree is too big to fit in the cache, then your search-50 cost is dominated by memory access time, provided the predicates are cheap.
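The two strategies can be sketched side by side. This is an illustration under assumptions, not a tuned implementation: `Node` is a hypothetical node type, and both methods return the number of node visits so the difference in traversal work is visible.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

final class PassCountSketch {
    // Hypothetical minimal AST node.
    static final class Node {
        final String kind;
        final List<Node> children;

        Node(String kind, Node... children) {
            this.kind = kind;
            this.children = Arrays.asList(children);
        }
    }

    // Search-once: a single traversal that tests the combined predicate at each node.
    // Returns the number of node visits made.
    static int searchOnce(Node n, Predicate<Node> anyOf, List<Node> hits) {
        int visits = 1;
        if (anyOf.test(n)) hits.add(n);
        for (Node child : n.children) visits += searchOnce(child, anyOf, hits);
        return visits;
    }

    // Search-many: one full traversal per predicate.
    // A node matching several predicates is reported once per matching predicate.
    static int searchPerPredicate(Node root, List<Predicate<Node>> predicates, List<Node> hits) {
        int visits = 0;
        for (Predicate<Node> p : predicates) visits += searchOnce(root, p, hits);
        return visits;
    }
}
```

With the numbers from the question, `searchOnce` would make about 10000 node visits and `searchPerPredicate` about 50 * 10000 = 500000; the predicate evaluations are the same in both.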

    Now, we can "cheat" in some interesting ways. First, the predicates may be related: if predicate A implies predicate B, I can check B first, and if it is false, then I do not have to test for A. That can cut the number of "or"s, but it does not help with the tree visits. Second, predicates may share subtests; for example, A is really "a1 and a2" and B is really "a1 and a3". In this case you can combine the predicates and evaluate the shared subpredicate just once: "a1" needs to be evaluated only once per node visit. This is not so easy to do with the multiple-scan technique. Third, the failure of some predicate may mean that a subtree need not be searched at all. Here the 50-pass search may win big, because each pass examines only the subtrees it needs, whereas search-once has to search under a node whenever any predicate says that is worthwhile.
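A small sketch of the shared-subtest idea, assuming A is "a1 and a2" and B is "a1 and a3". The subtests themselves (`a1`, `a2`, `a3`) and the `Node` class are invented for illustration only.

```java
import java.util.Arrays;
import java.util.List;

final class SharedSubtestSketch {
    // Hypothetical minimal AST node.
    static final class Node {
        final String kind;
        final String text;
        final List<Node> children;

        Node(String kind, String text, Node... children) {
            this.kind = kind;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    static final int MATCHES_A = 1; // A = a1 and a2
    static final int MATCHES_B = 2; // B = a1 and a3

    // Assumed subtests; a1 is the one shared by A and B.
    static boolean a1(Node n) { return n.kind.equals("MethodCall"); }
    static boolean a2(Node n) { return n.text.startsWith("get"); }
    static boolean a3(Node n) { return n.children.isEmpty(); }

    // Evaluates a1 once per node; if it fails, neither A nor B can hold.
    static int classify(Node n) {
        if (!a1(n)) return 0;
        int mask = 0;
        if (a2(n)) mask |= MATCHES_A;
        if (a3(n)) mask |= MATCHES_B;
        return mask;
    }
}
```

In a 50-pass search the pass for A and the pass for B would each evaluate `a1` on every node they visit, which is why this saving is harder to get there.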

    However, it is likely that your program wants to respond differently to each predicate, so its structure is really a series of "if P1(node) then A1(node)". If the predicates are cheap and trigger relatively often, then the actions are likely to be the dominant cost (more expensive than visiting tree nodes), and either technique will be fine performance-wise.
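The "if P1(node) then A1(node)" structure might look roughly like this in a single pass. The predicates, the actions, and the `Node` class are all hypothetical; here the actions only append a message to a report.

```java
import java.util.Arrays;
import java.util.List;

final class DispatchSketch {
    // Hypothetical minimal AST node.
    static final class Node {
        final String kind;
        final String text;
        final List<Node> children;

        Node(String kind, String text, Node... children) {
            this.kind = kind;
            this.text = text;
            this.children = Arrays.asList(children);
        }
    }

    // Assumed predicates P1 and P2.
    static boolean isMethodCall(Node n)  { return n.kind.equals("MethodCall"); }
    static boolean isNullLiteral(Node n) { return n.kind.equals("Literal") && n.text.equals("null"); }

    // Assumed actions A1 and A2.
    static void recordCall(Node n, List<String> report) { report.add("call to " + n.text); }
    static void flagNull(Node n, List<String> report)   { report.add("null literal"); }

    // One pre-order pass; each predicate that fires triggers its own action.
    static void visit(Node n, List<String> report) {
        if (isMethodCall(n))  recordCall(n, report);
        if (isNullLiteral(n)) flagNull(n, report);
        for (Node child : n.children) visit(child, report);
    }
}
```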

    In the end, if the predicates and actions are complex, you cannot easily guess which approach is cheaper. It is best to code both searches (that is not hard) and measure them on realistic data.
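A rough way to try both searches is sketched below. Everything in it is an assumption made for illustration: the synthetic tree (10000 nodes, 200 node kinds), the 50 trivially cheap predicates, and the naive `System.nanoTime` timing. Real measurements should use your own AST and predicates, and a benchmark harness such as JMH would give more trustworthy numbers than this loop.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

final class SearchTimingSketch {
    // Hypothetical node type: just an integer kind and a child list.
    static final class Node {
        final int kind;
        final List<Node> children = new ArrayList<>();
        Node(int kind) { this.kind = kind; }
    }

    // Builds a synthetic tree with exactly `count` nodes; node i hangs under node (i - 1) / 3.
    static Node buildTree(int count, int kinds) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            Node n = new Node(i % kinds);
            nodes.add(n);
            if (i > 0) nodes.get((i - 1) / 3).children.add(n);
        }
        return nodes.get(0);
    }

    // One traversal, counting the nodes that satisfy the given predicate.
    static int countMatches(Node n, Predicate<Node> p) {
        int hits = p.test(n) ? 1 : 0;
        for (Node child : n.children) hits += countMatches(child, p);
        return hits;
    }

    // One full traversal per predicate.
    static int countPerPredicate(Node root, List<Predicate<Node>> predicates) {
        int hits = 0;
        for (Predicate<Node> p : predicates) hits += countMatches(root, p);
        return hits;
    }

    public static void main(String[] args) {
        Node root = buildTree(10_000, 200);
        List<Predicate<Node>> predicates = new ArrayList<>();
        Predicate<Node> anyOf = n -> false;
        for (int k = 0; k < 50; k++) {
            final int kind = k;
            Predicate<Node> p = n -> n.kind == kind;
            predicates.add(p);
            anyOf = anyOf.or(p);
        }
        // The first rounds mostly warm up the JIT; look at the later ones.
        for (int round = 0; round < 5; round++) {
            long t0 = System.nanoTime();
            int once = countMatches(root, anyOf);
            long t1 = System.nanoTime();
            int many = countPerPredicate(root, predicates);
            long t2 = System.nanoTime();
            System.out.printf("round %d: once=%d us, per-predicate=%d us (hits %d/%d)%n",
                    round, (t1 - t0) / 1_000, (t2 - t1) / 1_000, once, many);
        }
    }
}
```

The predicates here are mutually exclusive, so both searches report the same hit count; with overlapping predicates the per-predicate count would be higher because a node is counted once per matching predicate.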
