Conversation
lib/streamlit/hashing.py
Outdated
There was a problem hiding this comment.
If this is an actual limitations, we should link an issue to it. Since I am changing this code, I am removing it.
There was a problem hiding this comment.
Agreed.
I don't think this is a use-case that matters for Streamlit authors anyway.
There was a problem hiding this comment.
I moved this to black_list_test. Let me know if I missed some coverage.
lib/streamlit/black_list.py
Outdated
lib/streamlit/black_list.py
Outdated
There was a problem hiding this comment.
I have a PR that fixes this function. Please merge in the changes. See #265
There was a problem hiding this comment.
Ok, I can just merge it in. Seems your PR is going to be merged earlier.
lib/streamlit/hashing.py
Outdated
There was a problem hiding this comment.
I have to think more deeply about this, but just something I noticed: this code uses cwd whereas LocalSourcesWatcher uses the folder where the main script lives.
Does it make sense to unify the two behaviors? If so, which is best: cwd or main folder?
There was a problem hiding this comment.
I wonder this is just an artifact of the code being written by two different people. I do think there might be issues with both. What if the main script is in a subdirectory? And what if we execute streamlit from a path that does not include CWD? Should be document and make this a user-option. We can make clear that only a subset of files will be watched/checked for mutation.
I like this to be the same for FileWatcher and here.
lib/streamlit/black_list.py
Outdated
There was a problem hiding this comment.
Move this to a utility file, so it can be reused in LocalSourcesWatcher
There was a problem hiding this comment.
BlackList would be shared, so we should not need to move this to a utility.
|
@tvst Let's chat whether we want this to be shared with FileWatcher. Let's also clarify the role of CWD/main folder path |
We decided to share this with FileWatcher and to use the main script folder. |
lib/streamlit/hashing.py
Outdated
| if util.file_is_in_folder( | ||
| filepath, _get_main_script_directory() | ||
| ) and not self._folder_black_list.is_blacklisted(filepath): |
There was a problem hiding this comment.
I am separately checking
- whether the file is in the current directory
- whether the file is not in the blacklist
| _file_is_in_folder(filepath, blacklisted_folder) | ||
| for blacklisted_folder in self._folder_blacklist | ||
| ) | ||
|
|
||
| if is_in_blacklisted_folder: | ||
| if self._folder_black_list.is_blacklisted(filepath): | ||
| continue | ||
|
|
||
| file_is_new = filepath not in self._watched_modules | ||
| file_is_local = _file_is_in_folder(filepath, self._report.script_folder) | ||
| file_is_local = util.file_is_in_folder( | ||
| filepath, self._report.script_folder | ||
| ) |
There was a problem hiding this comment.
Same as hashing.py, I am leaving i) checking if the file is local and ii) checking against the blacklist separate.
Also, here we use then self._report.script_folder as the main script folder.
lib/streamlit/hashing.py
Outdated
| main_path = __main__.__file__ | ||
| return os.path.dirname(main_path) |
There was a problem hiding this comment.
I am assuming this happens while I am executing a streamlit script and this is well defined.
lib/tests/streamlit/util_test.py
Outdated
| self.assertTrue(util.file_is_in_folder("/user/name/test", "/user")) | ||
| self.assertTrue(util.file_is_in_folder("/user/name/test", "/user/*")) | ||
| self.assertFalse(util.file_is_in_folder("/user/name/test", "/user/other")) |
There was a problem hiding this comment.
basic tests for this function.
|
I did not include the check for a file to be local in the blacklist. It seems to be it does not belong there semantically and would be a bit opaque, but there is some replication since the two checks are in the File Watcher and in |
| import __main__ | ||
| import os | ||
|
|
||
| main_path = __main__.__file__ |
There was a problem hiding this comment.
Does this point to the right file? In some toy tests on my computer, it seems to point to the equivalent of cli.py
Here's what I did:
# main_test.py
with open('main_subtest.py') as script_file:
script = script_file.read()
exec(script, {})# main_subtest.py
import streamlit as st
import __main__
st.write(__main__.__file__)Expected: should see "main_subtest.py" in Streamlit app
Actual: saw "main_test.py" in Streamlit app
There was a problem hiding this comment.
We do this, before running:
sys.modules["__main__"] = module
There was a problem hiding this comment.
Actually, happens here:
module.__dict__["__file__"] = self._report.script_path
There was a problem hiding this comment.
Which is what we are looking for, right?
I think if we like this solution we need at the very least a test.
There was a problem hiding this comment.
You're right. This code is fine, then!
Can you add a comment before line 415 explaining why this is fine?
There was a problem hiding this comment.
Will add the comment.
* develop: Fullscreen button to expand tables and charts (streamlit#137) Blacklist for hashing (streamlit#254) Adding utils and tests for hello.py (streamlit#261) Fix port-finding bugs (streamlit#280) Fix hello.py usage of os.path.join on a URL (streamlit#285)

Issue: #242
Description: Add a blacklist reusing code from the File Watcher. I extracted the code for the blacklist from the FileWatcher. This allows us to have a single point of configuration and some reuse. I added the logic to a class called
BlackList. Also added tests, mostly moving them out of the FileWatcher tests.Contribution License Agreement
By submiting this pull request you agree that all contributions to this project are made under the Apache 2.0 license.