Parsing URLs and file paths

ramblings about maintainable code

So today I wrote this ugly little piece of code:

<<
 reverse(chopleft(replace(reverse(replaceall(
  getenv('DOCUMENT_ROOT')+'/'
 ,'//','/')),'/',''),'/'))
>>

The task: take DOCUMENT_ROOT and remove the last part of the path. I wanted a one-liner, but I discovered that DOCUMENT_ROOT may not always have a trailing slash, which added to the complexity of the one-liner.

  1. Add Trailing /
  2. Replace all // with / (in case adding it wasn't needed)
  3. Reverse the result
  4. Remove the first /
  5. Chop everything to the left of the first /
  6. Reverse again

These...

/home/docs/public_html
/home/docs/public_html/

Would produce...

/home/docs/

Done!
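
For comparison, the same six steps can be sketched in Python. This is only an illustration, not a translation of any particular library: the name `parent_dir` is made up, and it assumes that `replace` removes only the first match and that `chopleft` keeps the slash it stops at, which is what the output above implies. It also assumes the path has at least two segments.

```python
def parent_dir(document_root: str) -> str:
    """Drop the last path segment, mirroring the six steps listed above."""
    path = document_root + "/"        # 1. add a trailing slash
    path = path.replace("//", "/")    # 2. collapse the doubled slash, if any
    path = path[::-1]                 # 3. reverse the result
    path = path.replace("/", "", 1)   # 4. remove the first slash only
    path = path[path.index("/"):]     # 5. chop everything left of the first slash
    return path[::-1]                 # 6. reverse again


for root in ("/home/docs/public_html", "/home/docs/public_html/"):
    print(parent_dir(root))           # prints /home/docs/ both times
```

Even spread over six commented lines, the reverse-and-chop trick still hides the goal; the function name carries most of the intent.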

So it would be nice to have a URL/file path parsing system. Perhaps something like...

pathdice('/1/2/3','/.../-1')  ==> /1/2
pathdice('/1/2/3/','/.../-1') ==> /1/2/
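
To make the idea a little more concrete, here is a minimal Python sketch of one possible reading of those two examples. `pathdice` is hypothetical and exists in no library; the sketch assumes absolute paths, understands only patterns of the form `/.../-N` (taken to mean "drop the last N segments"), and keeps a trailing slash when the input has one.

```python
def pathdice(path: str, pattern: str) -> str:
    """Hypothetical helper: drop trailing segments from an absolute path."""
    prefix, _, count = pattern.rpartition("/")
    # Only the '/.../-N' form is understood in this sketch.
    if prefix != "/..." or not count.startswith("-") or not count[1:].isdigit():
        raise ValueError(f"unsupported pattern: {pattern!r}")
    drop = int(count[1:])

    trailing = "/" if path.endswith("/") else ""
    segments = [s for s in path.split("/") if s]       # ignore empty pieces
    kept = segments[:max(len(segments) - drop, 0)]     # drop the last N segments
    if not kept:
        return "/"                                     # nothing left but the root
    return "/" + "/".join(kept) + trailing


print(pathdice("/1/2/3", "/.../-1"))    # /1/2
print(pathdice("/1/2/3/", "/.../-1"))   # /1/2/
```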

Are there libraries like this for other languages? If so, how do they tackle the problem?
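
As one data point, Python's standard library ships `pathlib`, which treats a path as a sequence of parts rather than a string. The sketch below wraps it in a helper named `parent_of` (a made-up name for illustration). Note that it returns the parent without a trailing slash, so it is close to, but not identical to, the output shown earlier.

```python
from pathlib import PurePosixPath


def parent_of(root: str) -> str:
    # PurePosixPath ignores a trailing slash, so both spellings
    # of the document root yield the same parent directory.
    return str(PurePosixPath(root).parent)


for root in ("/home/docs/public_html", "/home/docs/public_html/"):
    print(parent_of(root))   # prints /home/docs both times
```

The call reads as "the parent of this path", which is the kind of self-describing code the one-liner lacks.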

The issue I am trying to solve is maintainability. With a one-liner like that, it is almost impossible to recognize the original intent of whoever wrote it, so the code is not maintainable.

If instead I used some generalized file path library, the intent of my code would be clear to the person looking at it at some point in the future.

The code as written does give a few clues about my intent. First, since it asks for DOCUMENT_ROOT, a reader can see that I expect to be dealing with a file path. And since it uses forward slashes, the reader can infer that the paths are not expected to contain backslashes.

But beyond that, the code isn't clear at all as to intent.

In the absence of any such library, heavy comments are probably the next best thing. Of course, comments can get out of sync with the actual code.

So would my original code be better written...

<<
 reverse(chopleft(replace(reverse(replaceall(
  getenv('DOCUMENT_ROOT')+'/'
 ,'//','/')),'/',''),'/'))

 /#
 # 1. Add Trailing /
 # 2. Replace all // with / (in case adding it wasn't needed)
 # 3. Reverse the result
 # 4. Remove the first /
 # 5. Chop everything to the left of the first /
 # 6. Reverse again
 #/

>>