Ticket #107 (closed Defect: fixed)

Opened 13 years ago

Last modified 13 years ago

File browser proxy: URL error 404

Reported by: silke
Owned by: JuergeN
Priority: Major
Milestone:
Component: DeepaMehta Standard Distribution
Version: 4.0.4
Keywords:
Cc:
Complexity: 3
Area: Runtime
Module:

Description

Creating a new file browser does not work with Apache SSL proxy.

DeepaMehta shows "FolderContentRendererError: AJAX GET request failed, server response: 404 (Not Found), exception: undefined"

I manually created a folder containing a .txt file on the server. I cannot access either of them in the DM interface:

Clicking on the folder icon, Apache log shows:

[02/Oct/2011:17:44:08 +0200] "GET /proxy/file%3A%2F? HTTP/1.1" 404 628 "https://172.16.9.40/" "Mozilla/5.0 (X11; Linux x86_64; rv:7.0) Gecko/20100101 Firefox/7.0"

Clicking on the file, it shows

[02/Oct/2011:17:45:09 +0200] "GET /proxy/file%3A%2Fdeepamehta-files.txt?type=text/plain&size=13090 HTTP/1.1" 404 628 "https://172.16.9.40/" "Mozilla/5.0 (X11; Linux x86_64; rv:7.0) Gecko/20100101 Firefox/7.0"

Without Apache SSL proxy it works. The difference in the log is the "?" at the end of the file/folder name.

Change History

comment:1 Changed 13 years ago by jri

Thank you for detailed reporting!

This could be a URL en/decoding issue. Could it be that the Apache proxy decodes the request before passing it to the DM webserver? This would explain the 404.
(http://en.wikipedia.org/wiki/Percent-encoding)
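
To make the suspected en/decoding difference concrete, here is a minimal Python sketch (Python is used only for illustration; DM itself is Java). It takes the file name from the Apache log above and shows how the percent-encoded request path differs from what a decoding intermediary would forward. That Apache actually decodes the path is, at this point in the ticket, only a hypothesis.

```python
from urllib.parse import quote, unquote

# Resource URI that DM embeds in a single path segment
# (file name taken from the Apache log in this ticket).
resource_uri = "file:/deepamehta-files.txt"

# Percent-encode everything, including ":" and "/", so the URI
# fits into one path segment of the request.
encoded_path = "/proxy/" + quote(resource_uri, safe="")
print(encoded_path)  # /proxy/file%3A%2Fdeepamehta-files.txt

# What an intermediary that decodes the path would pass on instead.
decoded_path = unquote(encoded_path)
print(decoded_path)  # /proxy/file:/deepamehta-files.txt

# The decoded form contains one more "/" and therefore one more path segment.
print(encoded_path.count("/"), decoded_path.count("/"))  # 2 3
```

A backend that expects exactly one segment after /proxy/ would no longer recognize the decoded form.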

I'm not familiar with Apache reverse proxy configurations. Is there such a "URL encoding" config option? If so, it should be turned off: Apache must *not* decode the URL before passing the request to the DM webserver.

Do I understand correctly: after "Create New File Browser" you see the top-level directory listing (in the content panel) but you can't navigate any further? (I'm wondering how you can click the file when you don't see the directory listing.)

For the moment I have 2 suggestions:

1) You check whether Apache provides such a config option. (Perhaps I'm on the wrong track here. In any case I'm interested to learn from you :-)

2) I change the DM file proxy implementation to let it handle decoded requests as well. For testing: do you build DM from source or do you prefer the binary distribution?

BTW: thank you very much, too, for adapting the Ubuntu documentation to DM 4.0.4!

comment:2 Changed 13 years ago by JuergeN

Hi Jörg, I agree that this is most likely an encoding issue. I already did some research on the Apache proxy and encoding yesterday, but I could not find a way to turn it off for this specific version. I will take a closer look at that again today, but if you see a chance to do a quick fix in your implementation, that would be the better way to solve this problem anyway, I would say.

Sadly this is a true showstopper for server-side installations at the moment.

comment:3 Changed 13 years ago by Jörg Richter

Proxy module: address a 404 issue (#107).

1st try to fix a 404 issue in conjunction with a reverse proxy setup.

See ticket 107.

comment:4 Changed 13 years ago by jri

I made a patch in master.

Please try again.

comment:5 Changed 13 years ago by JuergeN

Jörg, could you do a binary snapshot? That would be ideal and much faster for me for testing. Thx Juergen.

comment:6 Changed 13 years ago by silke

Hi Jörg,
I still have the same problem as yesterday with your latest commit.
Cheers,
Silke

comment:7 Changed 13 years ago by jri

Jürgen, you can download the snapshot here:
https://github.com/downloads/jri/deepamehta/deepamehta-4.0.5-20111003.zip
It's the state of this afternoon.

comment:8 Changed 13 years ago by jri

Thank you, Silke, for testing!

A deeper inspection is required. Can you please send DM's console output?
(Not Apache's.)
Please capture the output while performing "Create New File Browser" and (in case the listing appears) clicking some files/folders.

comment:9 follow-up: ↓ 10 Changed 13 years ago by silke

Moin!
Following JuergeN's advice, I changed the DM file directory by manually adding a files directory inside the DM home directory. I specified the path in pom.xml (<dm4.proxy.files.path>).
I access the web interface on port 443.

The DM console's output:

Trying to add a new file browser: no output, nothing happens.

Clicking on a folder:

g! 04.10.2011 09:17:18 de.deepamehta.core.impl.service.EmbeddedService getRelatedTopics
INFO: topicId=2371, assocTypeUri="null", myRoleTypeUri="null", othersRoleTypeUri="null", othersTopicTypeUri="null", maxResultSize=100

In contrast, when I access the web interface on port 8080, I can create a folder - though I am not sure where that folder/any files are stored now, same path as above, JuergeN???

04.10.2011 09:27:05 de.deepamehta.plugins.proxy.ProxyPlugin getResource
INFO: Requesting resource "file:/" (mediaType="null", size=0)
04.10.2011 09:27:05 de.deepamehta.plugins.proxy.ProxyPlugin checkRemoteAccess
INFO: Checking remote access to "http://172.16.9.40:8080/proxy/file%3A%2F"
      dm4.proxy.net.filter="127.0.0.1/32", remote address="172.16.9.72" => FORBIDDEN
04.10.2011 09:27:05 de.deepamehta.core.impl.service.EmbeddedService getRelatedTopics
INFO: topicId=2289, assocTypeUri="null", myRoleTypeUri="null", othersRoleTypeUri="null", othersTopicTypeUri="null", maxResultSize=100

Clicking on that folder gives 403 Forbidden:

g! 04.10.2011 09:34:04 de.deepamehta.plugins.proxy.ProxyPlugin getResource
INFO: Requesting resource "file:/" (mediaType="null", size=0)
04.10.2011 09:34:04 de.deepamehta.plugins.proxy.ProxyPlugin checkRemoteAccess
INFO: Checking remote access to "http://172.16.9.40:8080/proxy/file%3A%2F"
      dm4.proxy.net.filter="127.0.0.1/32", remote address="172.16.9.72" => FORBIDDEN
04.10.2011 09:34:04 de.deepamehta.core.impl.service.EmbeddedService getRelatedTopics
INFO: topicId=2289, assocTypeUri="null", myRoleTypeUri="null", othersRoleTypeUri="null", othersTopicTypeUri="null", maxResultSize=100

comment:10 in reply to: ↑ 9 ; follow-up: ↓ 11 Changed 13 years ago by jri

Replying to silke:

Thank you for the log! It's revealing :)

> I access the web interface on port 443.

While you can run the DM server (including the web interface) on port 443, HTTPS is currently *not* supported. Only HTTP is.

> Trying to add a new file browser: no output, nothing happens.

To me it's puzzling that HTTP via 443 vs. HTTP via 8080 makes a difference at all. Anyway, (for the moment) there is no reason for running DM at port 443.

> In contrast, when I access the web interface on port 8080, I can create a folder - though I am not sure where that folder/any files are stored now, same path as above, JuergeN???

DM never creates files/folders in the filesystem. Instead it creates File/Folder topics which represent existing filesystem files/folders.

With DM's dm4.proxy.files.path config option you control which part of your filesystem is visible to DM. Set an absolute path (with no trailing slash) to an existing folder. Or set nothing (empty string) to make the entire filesystem visible.
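
The kind of check such an option implies can be sketched as follows. This is an illustrative Python sketch, not DM's actual Java implementation: the function name is hypothetical, and the use of a canonical path is inferred only from the "canonical request path" wording in DM's console log.

```python
import os

def is_path_allowed(files_path: str, request_path: str) -> bool:
    """Hypothetical check: is request_path inside the configured files_path?"""
    if files_path == "":
        # Empty setting: the entire filesystem is visible.
        return True
    # Resolve "..", "." and symlinks so the comparison uses canonical paths.
    root = os.path.realpath(files_path)
    canonical = os.path.realpath(request_path)
    # commonpath compares whole path components, so /srv/dm-files-old
    # is not mistaken for a child of /srv/dm-files.
    return os.path.commonpath([root, canonical]) == root

print(is_path_allowed("/srv/dm-files", "/srv/dm-files/notes.txt"))
print(is_path_allowed("/srv/dm-files", "/srv/dm-files/../secret.txt"))
print(is_path_allowed("", "/etc/hosts"))
```

The directory /srv/dm-files is a made-up example; the sketch assumes POSIX-style paths.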

> ...
> INFO: Checking remote access to "http://172.16.9.40:8080/proxy/file%3A%2F"
>       dm4.proxy.net.filter="127.0.0.1/32", remote address="172.16.9.72" => FORBIDDEN
> ...

This shows you have a permission problem. You must set DM's dm4.proxy.net.filter config option. With this option you control which clients have access to your filesystem: client access is restricted by means of a network filter, and clients not belonging to that network have no access. DM's default network filter is 127.0.0.1/32.

=> Please set dm4.proxy.net.filter to e.g. 172.16.9.0/24
This grants access to all clients in the 172.16.9 network
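
As an illustration of how such a CIDR filter behaves, the following Python sketch reproduces the decisions seen in the logs. It uses Python's standard ipaddress module; the function name is hypothetical, and the sketch does not show DM's own Java code.

```python
import ipaddress

def is_client_allowed(net_filter: str, remote_address: str) -> bool:
    """Return True if remote_address lies inside the network given by net_filter."""
    network = ipaddress.ip_network(net_filter)
    return ipaddress.ip_address(remote_address) in network

# Default filter, client on the LAN: refused (the FORBIDDEN case in the log).
print(is_client_allowed("127.0.0.1/32", "172.16.9.72"))   # False

# Suggested filter: every client in 172.16.9.0 - 172.16.9.255 is accepted.
print(is_client_allowed("172.16.9.0/24", "172.16.9.72"))  # True

# Default filter, request coming from the same host.
print(is_client_allowed("127.0.0.1/32", "127.0.0.1"))     # True
```

The last case corresponds to requests that arrive via a reverse proxy on the same machine: DM then sees the proxy's address (127.0.0.1), not the browser's.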

You set the network filter in ./pom.xml (when building from source) or in ./conf/config.properties (for the binary distribution).

Hope this helps.

Your original 404 is still puzzling as FORBIDDEN results in a 403.

comment:11 in reply to: ↑ 10 Changed 13 years ago by JuergeN

Replying to jri:

> Replying to silke:

> Thank you for the log! It's revealing :)

>> I access the web interface on port 443.

> While you can run the DM server (including the web interface) on port 443, HTTPS is currently *not* supported. Only HTTP is.

I think this is a misunderstanding! Port 443 means access through the Apache SSL proxy, which is indeed running on port 443 and creates this problem.

>> Trying to add a new file browser: no output, nothing happens.

> To me it's puzzling that HTTP via 443 vs. HTTP via 8080 makes a difference at all. Anyway, (for the moment) there is no reason for running DM at port 443.

>> In contrast, when I access the web interface on port 8080, I can create a folder - though I am not sure where that folder/any files are stored now, same path as above, JuergeN???

8080 is the direct connection to Felix. The path is exactly what you configured in the config file.

> DM never creates files/folders in the filesystem. Instead it creates File/Folder topics which represent existing filesystem files/folders.

> With DM's dm4.proxy.files.path config option you control which part of your filesystem is visible to DM. Set an absolute path (with no trailing slash) to an existing folder. Or set nothing (empty string) to make the entire filesystem visible.

>> ...
>> INFO: Checking remote access to "http://172.16.9.40:8080/proxy/file%3A%2F"
>>       dm4.proxy.net.filter="127.0.0.1/32", remote address="172.16.9.72" => FORBIDDEN
>> ...

> This shows you have a permission problem. You must set DM's dm4.proxy.net.filter config option. Using this option you control which clients have access to your filesystem. You can restrict client access by means of a network filter. Clients not belonging to that network have no access. DM's default network filter is 127.0.0.1/32.

> => Please set dm4.proxy.net.filter to e.g. 172.16.9.0/24
> This grants access to all clients in the 172.16.9 network

> You set the network filter in ./pom.xml (when building from source) or in ./conf/config.properties (for the binary distribution).

> Hope this helps.

> Your original 404 is still puzzling as FORBIDDEN results in a 403.

Please do not get too confused about this. When accessing DM directly, one can create a New File Browser and everything is fine once the IP range is part of the allowed addresses. But when accessing DM through the Apache SSL proxy (= port 443), one has the right IP address (localhost = 127.0.0.1) but, due to the encoding problem, gets a 404 as the result.

comment:12 Changed 13 years ago by jri

If my original hypothesis (a URL encoding problem) were correct, yesterday's patch should have fixed it. According to your reports this is not the case.

From Silke's report I understand that this is the only output in the 443 scenario when performing "Create New File Browser":

04.10.2011 09:17:18 de.deepamehta.core.impl.service.EmbeddedService getRelatedTopics
INFO: topicId=2371, assocTypeUri="null", myRoleTypeUri="null", othersRoleTypeUri="null", othersTopicTypeUri="null", maxResultSize=100

This means the requests to the DM proxy (not the Apache proxy) are not reaching DM at all.

Normally the log should look like this:

Oct 4, 2011 10:32:21 PM de.deepamehta.plugins.proxy.ProxyPlugin getResource
INFO: Requesting resource "file:/" (mediaType="null", size=0)
Oct 4, 2011 10:32:21 PM de.deepamehta.plugins.proxy.ProxyPlugin checkRemoteAccess
INFO: Checking remote access to "http://localhost:8080/proxy/file%3A%2F"
      dm4.proxy.net.filter="127.0.0.1/32", remote address="127.0.0.1" => ALLOWED
Oct 4, 2011 10:32:21 PM de.deepamehta.plugins.proxy.ProxyPlugin checkFileAccess
INFO: Checking file repository access to "/"
      dm4.proxy.files.path="", canonical request path="/" => ALLOWED
Oct 4, 2011 10:32:21 PM de.deepamehta.core.impl.service.EmbeddedService getRelatedTopics
INFO: topicId=2246, assocTypeUri="null", myRoleTypeUri="null", othersRoleTypeUri="null", othersTopicTypeUri="null", maxResultSize=100

Yesterday's patch changed the recognition pattern for the proxy requests to

/proxy/{.*}

That is, the prefix /proxy/ followed by arbitrary characters.
This should match even decoded URLs (DM proxy requests contain slashes *inside* path segments, and these are subject to URL encoding).
But apparently the request sent by Apache to DM doesn't match that pattern.
So the Jersey servlet (the REST toolkit responsible for dispatching incoming requests to Java methods) doesn't pass the request to DM and responds to Apache with a 404 instead. DM doesn't see the request.

From my point of view the Apache proxy corrupts the DM proxy requests or doesn't pass the DM proxy requests at all.

Without extra effort (e.g. putting a request logger servlet in the request chain to see *all* requests) I can't track down the problem.
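
One low-effort way to see what a proxy really forwards, without touching DM, is to point the proxy temporarily at a throwaway backend that records the raw request target. The Python sketch below is such a stand-in (it is not the servlet-based logger mentioned above, and every name in it is hypothetical). For demonstration it sends itself one request directly; in practice the request would come from Apache.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

seen_paths = []  # raw request targets, exactly as received on the wire

class LoggingHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.path is the request target as sent; it is not percent-decoded here.
        seen_paths.append(self.path)
        body = b"logged\n"
        self.send_response(200)
        self.send_header("Content-Type", "text/plain")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # suppress the default access log on stderr

def log_one_request(path: str) -> str:
    """Start the logger, send it a single GET for path, return what it recorded."""
    server = HTTPServer(("127.0.0.1", 0), LoggingHandler)  # port 0: any free port
    worker = threading.Thread(target=server.handle_request)  # serve exactly one request
    worker.start()
    connection = http.client.HTTPConnection("127.0.0.1", server.server_port)
    connection.request("GET", path)
    connection.getresponse().read()
    connection.close()
    worker.join()
    server.server_close()
    return seen_paths[-1]

print(log_one_request("/proxy/file%3A%2F"))
```

If a request routed through the proxy shows up here with a different path than the browser sent, the proxy is rewriting or decoding it.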

Does the Apache proxy provide a log of the requests as *passed to DM* (not as *received by Apache*)?

What does your Apache reverse proxy configuration look like?

Before making this too complicated, and to avoid further misunderstanding, we could meet and investigate live.


Additional info:

A DM proxy request looks like this:

/proxy/file:%2Fhome%2Fterry%2Fdesktop%2F31694.jpg

In decoded form:

/proxy/file:/home/terry/desktop/31694.jpg

Both forms match the pattern above while the decoded form doesn't match the original pattern (before the patch):

/proxy/{[^/]+?}
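
The effect of the two patterns can be checked with ordinary regular expressions. The Python sketch below is a simplification under stated assumptions: it treats each path template as a plain regex over the whole path, which is not exactly how Jersey matches requests, and the helper name is hypothetical.

```python
import re

# Regex stand-ins for the two path templates quoted in this ticket.
ORIGINAL = re.compile(r"/proxy/([^/]+?)")  # exactly one segment, no "/" allowed
PATCHED = re.compile(r"/proxy/(.*)")       # anything after /proxy/

encoded = "/proxy/file:%2Fhome%2Fterry%2Fdesktop%2F31694.jpg"
decoded = "/proxy/file:/home/terry/desktop/31694.jpg"

def matches(pattern: re.Pattern, path: str) -> bool:
    """True if the pattern covers the entire request path."""
    return pattern.fullmatch(path) is not None

print(matches(ORIGINAL, encoded))  # True
print(matches(ORIGINAL, decoded))  # False: the decoded slashes split the segment
print(matches(PATCHED, encoded))   # True
print(matches(PATCHED, decoded))   # True
```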

comment:13 Changed 13 years ago by JuergeN

After a longer session of debugging today, I found out that the original URL seems to be modified by the Apache proxy server. In particular it looks like this:

URL passed from Apache proxy to jetty:

com.sun.jersey.api.NotFoundException: null for uri: http://localhost:8280/files/folder//

Don't be confused by the double slash at the end; the more important point is that the URL should start with /proxy/, as you mentioned.

comment:14 Changed 13 years ago by Jörg Richter

Proxy module: address a 404 issue, 2nd try (#107).

2nd try to fix a 404 issue in conjunction with a reverse proxy setup.
Now we flush the response stream.

See ticket 107.

comment:15 Changed 13 years ago by Jörg Richter

Files module: address a 404 issue, 3rd try (#107).

3rd try to fix a 404 issue in conjunction with a reverse proxy setup.
Change regex for POST /files/folder requests.

See ticket 107.

comment:16 Changed 13 years ago by Jörg Richter

Files module: address a 404 issue (#107).

Change regex for the other requests as well.

See ticket 107.

comment:17 Changed 13 years ago by Jörg Richter

  • Status changed from new to closed
  • Resolution set to fixed

Wrapping up the Apache reverse proxy fix (#107).

The original problem was that DM's file related features didn't work if and
only if DM was running behind an Apache reverse proxy.

What we've learned:

1) Apache decodes URLs before passing the request to the backend application!

As a consequence Jersey's default pattern for matching the request's path
segments ([^/]+?) does *not* match if the URL contains encoded slashes
(e.g. when a path is encoded *within* a path segment), because after
decoding they split the segment. This caused Jersey to respond with 404.
The solution was to use a custom pattern (.+) for Jersey.

2) By default Apache doesn't allow encoded slashes in URLs at all!

Such URLs are refused with a 404.
The solution was to modify this behavior by means of Apache's
AllowEncodedSlashes directive.

Thanks to JuergeN and Silke for helping to track things down!

Close ticket 107.
