Ooh Brian you're so unfunctional today... kiddin :-)
It is indeed a nice exercise. Unfortunately it can't be done with regexes because your language is context free (more general than a regular language). It requires a stack machine to be parsed. So here we go with the state machine implementation... The stack state is given by the 'nesting' variable.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21
open System let newToken = "" let eatChar (nesting, (currentToken : String), (prevTokens : String list)) currentChar = match currentChar with (* Split a token on spaces only if it is at top level *) | ' ' -> if nesting = 0 then (0, newToken, List.append prevTokens [currentToken]) else (nesting, currentToken + currentChar.ToString(), prevTokens) | '(' -> if nesting = 0 then (1, newToken, List.append prevTokens [currentToken]) else (nesting + 1, currentToken + currentChar.ToString(), prevTokens) | ')' -> if nesting = 1 then (0, newToken, List.append prevTokens [currentToken]) else (nesting - 1, currentToken + currentChar.ToString(), prevTokens) | _ -> (nesting, currentToken + currentChar.ToString(), prevTokens) let parse (string : String) = let (_, finalToken, tokens) = Seq.fold eatChar (0, "", []) string List.append tokens [finalToken] |> List.map (fun s -> s.Trim()) |> List.filter (String.length >> (<) 0)
It doesn't fully respect your example because it removes () from the generated tokens. I'm guessing it would still be ok. It's easy to extend to generate a tree of tokens (full parsing).
1 2 3
> parse "(alpha (brown) (charlie delta)) echo (foxtrot golf)";; val it : string list = ["alpha (brown) (charlie delta)"; "echo"; "foxtrot golf"]
Mau
-Edited to correct an issue with spaces.-
Thanks Brian and Mau.
You guys are really good although I'm bit disappointed that it's not using regex. I thought regex is a magical tool that can deal with everything. I didn't know it can't be done with regex (no wonder why it's so hard to find a right pattern :P).
Hey Alex, if you knew that you were going to have at most n of levels of nesting, then you could use a regex. But I think it would grow ugly pretty quick as n increases.
I see. I thought it could be done with a recursive function that might be able to handle nested brackets. How about generate regex patterns dynamically? Say evaluate the string once so find out the depth of nested brackets (I guess this one would be really simple and quick.) and then pragmatically produce right regex patterns for it... Does this sound crazy? :P
Thanks for your explanation and the source codes.
Regards,
> I see. I thought it could be done with a recursive function that might be able to handle nested brackets.
It can. The recursion gives you the stack.
> How about generate regex patterns dynamically? Say evaluate the string once so find out the depth of nested brackets (I guess this one would be really simple and quick.)
> and then pragmatically produce right regex patterns for it... Does this sound crazy? :P
Well if you scan the input to count brackets you may as well tokenize while you're at it.
I'd like to see your dyamic RE idea implemented though! :-)
anyone think about using FParsec for this kind of thing ?
The other useful one I used in Lua(when I need to write a json parser) is PEG(i.e. LPEG in lua). I found it to be very powerful and seems to be very fast too.
I hacked this because it was fun, but I don't necessarily claim that the code is good or correct.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42
type Data = | Atom of string | List of Data list let parse (input:string) = let max = input.Length let rec skipWS i = if i < max && input.[ i] = ' ' then skipWS (i+1) else i let rec parseList n = assert(input.[ n] = '(') let mutable i = skipWS(n+1) let l = new ResizeArray<_>() while i < max && input.[ i] <> ')' do let data, j = parseData i l.Add(data) i <- j if i=max then failwith "bad parse - missing )" assert(input.[ i] = ')') List(Seq.toList l), i+1 and parseAtom n = let mutable i = skipWS(n) if i=max then failwith "bad parse - expected atom" let sb = new System.Text.StringBuilder() while i < max && input.[ i] <> ' ' && input.[ i] <> '(' && input.[ i] <> ')' do sb.Append(input.[ i]) |> ignore i <- i + 1 if sb.Length = 0 then failwith "bad parse - expected atom" Atom(sb.ToString()), i and parseData n = let i = skipWS(n) if input.[ i] = '(' then parseList i else parseAtom i let r, i = parseList 0 r let s = "(alpha (brown) (charlie delta)) echo (foxtrot golf)" printfn "%A" (parse ("("+s+")"))
Topic tags
- f# × 3707
- websharper × 2884
- core × 418
- bolero × 329
- compiler × 291
- enhancement × 215
- functional × 201
- bug × 177
- ui next × 140
- ui × 132
- c# × 122
- classes × 97
- web × 97
- .net × 84
- book × 84
- async × 77
- ui.next × 67
- templates × 58
- website × 51
- trywebsharper × 50
- question × 46
- html × 45
- server × 45
- owin × 44
- javascript × 43
- parallel × 43
- parsing × 41
- testing × 41
- typescript × 39
- template × 38
- sitelet × 31
- asynchronous × 30
- feature request × 28
- monad × 28
- ocaml × 28
- warp × 28
- tutorial × 27
- haskell × 26
- dotnet-ws × 23
- linq × 22
- sitelets × 22
- workflows × 22
- rpc × 21
- getting started × 20
- wpf × 20
- fpish × 19
- introduction × 19
- silverlight × 19
- monodevelop × 17
- piglets × 17
- suave × 17
- docs × 16
- collections × 15
- jquery × 15
- proposal × 15
- aspnetcore × 14
- pipeline × 14
- reactive × 14
- 4.6.0.361 × 13
- documentation × 13
- kendoui × 13
- formlets × 12
- 4.1.0.171 × 11
- monads × 11
- released: v0.1 × 11
- websocket × 11
- 4.4.0.280 × 10
- 4.4.1.288 × 10
- opinion × 10
- tryfsharponwasm × 10
- 4.0.190.100-rc × 9
- deployment × 9
- fixed × 9
- in × 9
- json × 9
- plugin × 9
- scheme × 9
- solid × 9
- wontfix × 9
- 4.3.0.274 × 8
- 4.5.4.317 × 8
- basics × 8
- concurrent × 8
- highcharts × 8
- how-to × 8
- mvu × 8
- python × 8
- released: v0.11 × 8
- 4.1.1.175 × 7
- 4.5.1.304 × 7
- complexity × 7
- remoting × 7
- visual studio × 7
- 4.1.2.178 × 6
- 4.5.4.151 × 6
- authentication × 6
- datefns × 6
- lisp × 6
- real-world × 6
- released in 4.0.192.103-rc × 6
- resources × 6
- scala × 6
- websharper ui.next × 6
- workshop × 6
- xaml × 6
- 4.0.193.110 × 5
- 4.2.11.258 × 5
- 4.2.3.236 × 5
- aspnetmvc × 5
- azure × 5
- bootstrap × 5
- conference × 5
- css × 5
- dsl × 5
- formlet × 5
- java × 5
- list × 5
- metaprogramming × 5
- ml × 5
- q&a × 5
- released in Zafir.4.0.188.91-beta10 × 5
- released: v0.4 × 5
- released: v0.8 × 5
- spa × 5
- sql × 5
- visualstudio × 5
- websharper.forms × 5
- zafir × 5
- 4.0.192.106 × 4
- 4.0.195.127 × 4
- 4.1.0.38 × 4
- 4.2.1.86 × 4
- 4.2.13.263 × 4
- 4.2.6.118 × 4
- 4.5.5.155 × 4
- 4.6.4.404 × 4
- discussion × 4
- example × 4
- extension × 4
- extensions × 4
- fsi × 4
- fsx × 4
- help wanted × 4
- highlightjs × 4
- html5 × 4
- jqueryui × 4
- lift × 4
- performance × 4
- qna × 4
- react × 4
- reflection × 4
- released: v0.10 × 4
- released: v0.5 × 4
- remote × 4
- rest × 4
- teaching × 4
- todomvc × 4
- 4.0.196.147 × 3
- 4.1.0.34 × 3
- 4.1.6.207 × 3
- 4.2.1.223-beta × 3
- 4.2.14.264 × 3
- 4.2.4.114 × 3
- 4.2.4.247 × 3
- 4.2.5.115 × 3
- 4.2.6.253 × 3
- 4.2.9.256 × 3
- 4.5.0.140 × 3
- 4.5.0.290 × 3
- 4.5.18.348 × 3
- 4.5.2.309 × 3
- 4.5.8.327 × 3
- 4.6.2.386 × 3
- ajax × 3
- alt.net × 3
- aml × 3
- asp.net mvc × 3
- build × 3
- canvas × 3
- cloudsharper × 3
- compilation × 3
- d3 × 3
- data × 3
- database × 3
- erlang × 3
- events × 3
- file upload × 3
- forums × 3
- how to × 3
- http × 3
- inline × 3
- issue × 3
- kendo × 3
- macro × 3
- materialui × 3
- mono × 3
- msbuild × 3
- mvc × 3
- pattern × 3
- piglet × 3
- released in Zafir.4.0.187.90-beta10 × 3
- released: v0.12 × 3
- released: v0.9 × 3
- svg × 3
- type provider × 3
- view × 3
- websharper4 × 3
- 4.1.1.64 × 2
- 4.1.5.203 × 2
- 4.1.7.232 × 2
- 4.2.10.257 × 2
- 4.2.3.111 × 2
- 4.2.5.249 × 2
- 4.3.0.127 × 2
- 4.3.1.275 × 2
- 4.5.10.166 × 2
- 4.5.10.332 × 2
- 4.5.15.342 × 2
- 4.5.19.349 × 2
- 4.5.3.146 × 2
- 4.5.9.301 × 2
- android × 2
- api × 2
- asp.net × 2
- beginner × 2
- blog × 2
- chart × 2
- client × 2
- client server app × 2
- clojure × 2
- computation expressions × 2
- constructor × 2
- corporate × 2
- courses × 2
- cufp × 2
- debugging × 2
- direct × 2
- discriminated union × 2
- dom × 2
- elm × 2
- endpoint × 2
- endpoints × 2
- enterprise × 2
- entity framework × 2
- event × 2
- f# interactive × 2
- fable × 2
- flowlet × 2
- formdata × 2
- forms × 2
- fsc × 2
- fsharp × 2
- google × 2
- google maps × 2
- hosting × 2
- https × 2
- iis 8.0 × 2
- install × 2
- interactive × 2
- interface × 2
- iphone × 2
- iteratee × 2
- jobs × 2
- jquery mobile × 2
- keynote × 2
- lens × 2
- lenses × 2
- linux × 2
- listmodel × 2
- mac × 2
- maps × 2
- numeric × 2
- oauth × 2
- obfuscation × 2
- offline × 2
- oop × 2
- osx × 2
- packaging × 2
- pattern matching × 2
- pipelines × 2
- post × 2
- quotation × 2
- reference × 2
- released in Zafir.4.0.185.88-beta10 × 2
- released: v0.13 × 2
- released: v0.6 × 2
- remarkable × 2
- rx × 2
- script × 2
- security × 2
- self host × 2
- seq × 2
- sockets × 2
- stm × 2
- sweetalert × 2
- tcp × 2
- trie × 2
- tutorials × 2
- type × 2
- url × 2
- var × 2
- websharper.charting × 2
- websockets × 2
- wig × 2
- xna × 2
- zh × 2
- .net framework × 1
- .net interop × 1
- 2012 × 1
- 4.0.194.126 × 1
- 4.1.3.184 × 1
- 4.1.4.189 × 1
- 4.2.0.214-beta × 1
- 4.2.12.259 × 1
- 4.2.2.231-beta × 1
- 4.2.8.255 × 1
- 4.4.1.137 × 1
- 4.5.1.141 × 1
- 4.5.11.334 × 1
- 4.5.12.177 × 1
- 4.5.13.318 × 1
- 4.5.13.338 × 1
- 4.5.16.344 × 1
- 4.5.2.145 × 1
- 4.5.3.144 × 1
- 4.5.3.310 × 1
- 4.5.5.319 × 1
- 4.5.6.156 × 1
- 4.5.6.320 × 1
- 4.5.7.322 × 1
- 4.5.8.161 × 1
- 4.5.9.164 × 1
- 4.6.1.127 × 1
- 4.6.1.381 × 1
- 4.6.3.388 × 1
- 4.6.5.406 × 1
- 4.6.6.407 × 1
- Canvas Sample Example × 1
- DynamicStyle Animated Style × 1
- ES8 × 1
- Fixed in 4.0.190.100-rc × 1
- Metro-Ui-Css × 1
- Metro4 × 1
- Released in Zafir.UI.Next.4.0.169.79-beta10 × 1
- SvgDynamicAttribute × 1
- Swiper × 1
- WebComponent × 1
- WebSharper.TypeScript × 1
- abstract class × 1
- accumulator × 1
- active pattern × 1
- actor × 1
- addin × 1
- agents × 1
- aggregation × 1
- agile × 1
- alter session × 1
- animation × 1
- anonymous object × 1
- apache × 1
- appcelerator × 1
- architecture × 1
- array × 1
- arrays × 1
- asp.net 4.5 × 1
- asp.net core × 1
- asp.net integration × 1
- asp.net mvc 4 × 1
- asp.net web api × 1
- aspnet × 1
- ast × 1
- attributes × 1
- authorization × 1
- b-tree × 1
- back button × 1
- badimageformatexception × 1
- bash script × 1
- batching × 1
- binding-vars × 1
- bistro × 1
- body × 1
- bundle × 1
- camtasia studio × 1
- cas protocol × 1
- charts × 1
- clarity × 1
- class × 1
- cli × 1
- clipboard × 1
- clojurescript × 1
- closures × 1
- cloud × 1
- cms × 1
- code-review × 1
- coding diacritics × 1
- color highlighting × 1
- color zones × 1
- combinator × 1
- combinators × 1
- compile × 1
- compile code on server × 1
- config × 1
- confirm × 1
- content × 1
- context × 1
- context.usersession × 1
- continuation-passing style × 1
- coords × 1
- cordova × 1
- cors × 1
- coursera × 1
- cross-domain × 1
- csla × 1
- current_schema × 1
- custom content × 1
- data grid × 1
- datetime × 1
- debug × 1
- declarative × 1
- delete × 1
- devexpress × 1
- dhtmlx × 1
- dictionary × 1
- directattribute × 1
- disqus × 1
- distance × 1
- do binding × 1
- doc elt ui.next upgrade × 1
- docker × 1
- dojo × 1
- dol × 1
- domain × 1
- dotnet core × 1
- du × 1
- duf-101 × 1
- dynamic × 1
- eastern language × 1
- eclipse × 1
- edsl × 1
- em algorithm × 1
- emacs × 1
- emotion × 1
- enums × 1
- error × 1
- etw × 1
- euclidean × 1
- eventhandlerlist × 1
- examples × 1
- ext js × 1
- extension methods × 1
- extjs × 1
- extra × 1
- facet pattern × 1
- failed to translate × 1
- fake × 1
- fantomas × 1
- fear × 1
- float × 1
- form × 1
- form-data × 1
- forum × 1
- fp × 1
- frank × 1
- fsdoc × 1
- fsharp.core × 1
- fsharp.powerpack × 1
- fsharpx × 1
- fsunit × 1
- function × 1
- functional style × 1
- game × 1
- games × 1
- gc × 1
- generic × 1
- geometry × 1
- getlastwin32error × 1
- getting-started × 1
- good first issue × 1
- google visualization timeline × 1
- google.maps × 1
- grid × 1
- group × 1
- guide × 1
- hash × 1
- headers × 1
- hello world example × 1
- heroku × 1
- highchart × 1
- history × 1
- html-templating × 1
- http405 × 1
- httpcontext × 1
- hubfs × 1
- i18n × 1
- ide × 1
- ie 8 × 1
- if-doc × 1
- iis × 1
- image × 1
- images × 1
- inheritance × 1
- initialize × 1
- input × 1
- install "visual studio" × 1
- installer × 1
- int64 × 1
- interfaces × 1
- internet explorer × 1
- interop × 1
- interpreter × 1
- invalid × 1
- io × 1
- iobservable × 1
- ios × 1
- iot × 1
- ipad × 1
- isomorphic × 1
- javascript optimization × 1
- javascript semanticui resources × 1
- jquery-plugin × 1
- jquery-ui × 1
- jquery-ui-datepicker × 1
- jquerymobile × 1
- js × 1
- kendo datasource × 1
- kendochart × 1
- kendoui compiler × 1
- knockout × 1
- l10n × 1
- leaflet × 1
- learning × 1
- library × 1
- libs × 1
- license × 1
- licensing × 1
- lineserieszonescfg × 1
- local setting × 1
- localization × 1
- logging × 1
- loop × 1
- macros × 1
- mailboxprocessor × 1
- mapping × 1
- markerclusterer × 1
- markup × 1
- marshal × 1
- math × 1
- mathjax × 1
- message × 1
- message passing × 1
- message-passing × 1
- meta × 1
- metro style × 1
- metro-ui × 1
- micro orm × 1
- minimum-requirements × 1
- mix × 1
- mobile installation × 1
- mod_mono × 1
- modal × 1
- module × 1
- mouseevent × 1
- mouseposition × 1
- multidimensional × 1
- multiline × 1
- multithreading × 1
- mysql × 1
- mysqlclient × 1
- nancy × 1
- native × 1
- nested × 1
- nested loops × 1
- netstandard × 1
- node × 1
- nunit × 1
- object relation mapper × 1
- object-oriented × 1
- om × 1
- onboarding × 1
- onclick × 1
- optimization × 1
- option × 1
- orm × 1
- os x × 1
- output-path × 1
- override × 1
- paper × 1
- parameter × 1
- persistence × 1
- persistent data structure × 1
- phonegap × 1
- plotly × 1
- pola × 1
- powerpack × 1
- prefix tree × 1
- principle of least authority × 1
- privacy × 1
- private × 1
- profile × 1
- programming × 1
- project × 1
- project euler × 1
- projekt_feladat × 1
- protected × 1
- provider × 1
- proxy × 1
- ptvs × 1
- public × 1
- pure f# × 1
- purescript × 1
- quant × 1
- query sitelet × 1
- quotations × 1
- range × 1
- raphael × 1
- razor × 1
- rc × 1
- reactjs × 1
- real-time × 1
- ref × 1
- region × 1
- released in 4.0.190.100-rc × 1
- released: v0.2 × 1
- released: v0.3 × 1
- released: v0.7 × 1
- reporting × 1
- responsive design × 1
- rest api × 1
- rest sitelet × 1
- restful × 1
- round table × 1
- router × 1
- routing × 1
- rpc reverseproxy × 1
- runtime × 1
- sales × 1
- sample × 1
- sampleapp × 1
- scriptcs × 1
- scripting × 1
- search × 1
- self hosted × 1
- semanticui × 1
- sequence × 1
- serialisation × 1
- service × 1
- session-state × 1
- sharepoint × 1
- signals × 1
- sitelet website × 1
- sitelet.protect × 1
- sitlets × 1
- slickgrid × 1
- source code × 1
- sqlentityconnection × 1
- ssl × 1
- standards × 1
- static content × 1
- stickynotes × 1
- streamreader × 1
- stress × 1
- strong name × 1
- structures × 1
- submitbutton × 1
- subscribe × 1
- svg example html5 websharper.ui.next × 1
- system.datetime × 1
- system.reflection.targetinvocationexception × 1
- table storage × 1
- targets × 1
- tdd × 1
- template ClientServer × 1
- templates ui.next × 1
- templating × 1
- text parsing × 1
- three.js × 1
- time travel × 1
- tls × 1
- tooltip × 1
- tracing × 1
- tsunamiide × 1
- turkish × 1
- twitter-bootstrap × 1
- type erasure × 1
- type inference × 1
- type providers × 1
- type-providers × 1
- typeprovider × 1
- ui next forms × 1
- ui-next × 1
- ui.next jqueryui × 1
- ui.next charting × 1
- ui.next formlets × 1
- ui.next forms × 1
- ui.next suave visualstudio × 1
- ui.next templating × 1
- unicode × 1
- unittest client × 1
- up for grabs × 1
- upload × 1
- usersession × 1
- validation × 1
- vb × 1
- vb.net × 1
- vector × 1
- view.map × 1
- visal studio × 1
- visual f# × 1
- visual studio 11 × 1
- visual studio 2012 × 1
- visual studio code × 1
- visual studio shell × 1
- visualstudio-websharper × 1
- vs2017 compiler zafir × 1
- vsix × 1
- web api × 1
- web-scraping × 1
- webapi × 1
- webcomponents × 1
- webforms × 1
- webgl × 1
- webrtc × 1
- webshaper × 1
- websharper async × 1
- websharper codemirror × 1
- websharper f# google × 1
- websharper forms × 1
- websharper reactive × 1
- websharper rpc × 1
- websharper sitelets routing × 1
- websharper warp × 1
- websharper-interface-generator × 1
- websharper.chartsjs × 1
- websharper.com × 1
- websharper.exe × 1
- websharper.owin × 1
- websharper.ui.next × 1
- websharper.ui.next jquery × 1
- websockets iis × 1
- webspeech × 1
- why-websharper × 1
- windows 7 × 1
- windows 8 × 1
- windows-phone × 1
- winrt × 1
- www.grabbitmedia.com × 1
- xamarin × 1
- xml × 1
- yeoman × 1
- yield × 1
- zafir beta × 1
- zafir websharper4 × 1
- zarovizsga × 1
![]() |
Copyright (c) 2011-2012 IntelliFactory. All rights reserved. Home | Products | Consulting | Trainings | Blogs | Jobs | Contact Us | Terms of Use | Privacy Policy | Cookie Policy |
Built with WebSharper |
Hi all,
Let say there is a nested bracket string "(alpha (brown) (charlie delta)) echo (foxtrot golf)". I want to collect tokens recursively based on the brackets so the tokens will be
Beginning it splits into 3 tokens
(alpha (brown) (charlie delta))
echo
(foxtrot golf)
and then the first token can be split into 3 tokens again
alpha
brown
(charlie delta)
and then the last token "(charlie delta)" are split into 2 tokens again.
"echo" is a token.
"(foxtrot golf)" should also be split into 2 tokens again.
I think I need to use some sort of Regex pattern for this but can't figure out how... Any suggestions?
Thank you very much.