refactor: improve install script for customizable installation directory#33956
refactor: improve install script for customizable installation directory#33956
Conversation
Summary of ChangesHello @tomchon, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed! This pull request significantly refactors the installation script to enhance its configurability, primarily by introducing the ability to specify a custom installation directory. This change ensures that all related data, log, and configuration paths are correctly managed and updated, providing a more adaptable and robust installation experience for users who require non-default installation locations. Highlights
Using Gemini Code AssistThe full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips. Invoking Gemini You can request assistance from Gemini at any point by creating a comment using either
Customization To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a Limitations & Feedback Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here. You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension. Footnotes
|
There was a problem hiding this comment.
Code Review
This pull request refactors the installation script to allow for a customizable installation directory, which is a valuable enhancement. The implementation correctly adds a new command-line option and logic to handle custom paths. However, the review identified several areas for improvement to increase the script's robustness and correctness. These include a critical bug where a sudo command is missing, a non-idempotent configuration update logic that could lead to duplicate settings, and several deviations from shell scripting best practices like proper variable quoting and using specific sed patterns. Addressing these points will make the installation process more reliable and prevent potential failures.
There was a problem hiding this comment.
Pull request overview
This pull request refactors the installation script to support customizable installation directories. The primary motivation is to allow users to specify a custom installation directory via a new -d flag, rather than being restricted to the default /usr/local/${PREFIX} location.
Key Changes:
- Added support for custom installation directory via new
-dcommand-line option - Refactored directory path initialization to be set conditionally based on whether a custom directory is specified
- Updated configuration files for various components (taosx, explorer, adapter, keeper, taosd) to use the dynamically determined log and data directories
- Renamed script references from
start_pre.shtostartPre.shfor consistency
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 19 out of 19 changed files in this pull request and generated 15 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…n and configuration scripts
…ion of Python virtual environments
…ring installation
…_systemd function
- Added command-line options for silent mode and custom install directory. - Enhanced installation directory detection logic. - Simplified service management commands by removing unnecessary sudo usage. - Improved handling of configuration, data, and log file removal. - Updated service management for macOS to handle user mode correctly. - Cleaned up redundant code and improved readability. Fix start-model.sh to ensure proper log file path handling - Adjusted log file path construction to prevent double slashes in the log directory.
…o enh/tsdb-define-install-dir
…l_service_on_systemd function
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 20 out of 20 changed files in this pull request and generated 20 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| else | ||
| ${csudo}chmod 666 /etc/hosts | ||
| ${csudo}echo "127.0.0.1 $1" >>/etc/hosts | ||
| chmod 666 /etc/hosts |
There was a problem hiding this comment.
add_newHostname_to_hosts changes /etc/hosts to mode 666 and never restores it, leaving the system hosts file permanently world-writable. Once this installer has run as root, any unprivileged local user can arbitrarily edit /etc/hosts to redirect traffic for arbitrary hostnames (including TDengine cluster peers or external services), enabling local traffic hijacking, credential theft, or targeted denial of service. Instead, keep /etc/hosts owned by root with restrictive permissions (e.g., 644) and append entries using a root-only mechanism without making the file globally writable.
| chmod 666 /etc/hosts | |
| chmod 644 /etc/hosts |
| } | ||
|
|
||
| function install_anode_venv() { | ||
| ${csudo}mkdir -p ${venvDir} && ${csudo}chmod 777 ${venvDir} |
There was a problem hiding this comment.
install_anode_venv sets the main Python virtualenv directory venvDir to mode 777, even though the taosanoded systemd unit later runs uwsgi and Python from that virtualenv as root. This allows any local user to delete or replace binaries or libraries inside ${venvDir} (e.g., bin/python3 or bin/uwsgi), leading to arbitrary code execution in the context of the TDgpt service when it is restarted. Restrict write permissions on ${venvDir} to the service account (e.g., root or a dedicated user) and avoid world-writable permissions on any directory used to host code executed by a privileged service.
| ${csudo}mkdir -p ${venvDir} && ${csudo}chmod 777 ${venvDir} | |
| ${csudo}mkdir -p ${venvDir} && ${csudo}chmod 755 ${venvDir} |
packaging/tools/install.sh
Outdated
| mkdir -p ${logDir} && mkdir -p ${logDir}/tcmalloc && mkdir -p ${logDir}/jemalloc && chmod 777 ${logDir} | ||
| if [ $taos_dir_set -eq 0 ] && [ $user_mode -eq 0 ]; then | ||
| ln -sf ${logDir} ${install_main_dir}/log |
There was a problem hiding this comment.
install_log creates the TDengine log directory logDir with mode 777, which makes /var/log/taos world-writable in a root installation. A local attacker can exploit this by creating or replacing log files or symlinks (e.g., taosd.log) so that root-run services write into attacker-controlled locations, enabling log tampering and, in some cases, elevation of privilege via symlink attacks. The log directory should be owned by the service account and use restrictive permissions (e.g., 750 or 755) with log files created non-world-writable, and symlink creation by unprivileged users should not be possible.
| mkdir -p ${logDir} && mkdir -p ${logDir}/tcmalloc && mkdir -p ${logDir}/jemalloc && chmod 777 ${logDir} | |
| if [ $taos_dir_set -eq 0 ] && [ $user_mode -eq 0 ]; then | |
| ln -sf ${logDir} ${install_main_dir}/log | |
| mkdir -p "${logDir}" && mkdir -p "${logDir}/tcmalloc" && mkdir -p "${logDir}/jemalloc" | |
| if [ $user_mode -eq 0 ]; then | |
| chmod 750 "${logDir}" | |
| else | |
| chmod 755 "${logDir}" | |
| fi | |
| if [ $taos_dir_set -eq 0 ] && [ $user_mode -eq 0 ]; then | |
| ln -sf "${logDir}" "${install_main_dir}/log" |
| @@ -234,33 +293,50 @@ function install_config() { | |||
|
|
|||
| function install_log() { | |||
| ${csudo}mkdir -p ${logDir} && ${csudo}chmod 777 ${logDir} | |||
There was a problem hiding this comment.
install_log creates the TDgpt log directory logDir with mode 777, making /var/log/taos/taosanode (in the default install) world-writable while the taosanoded service runs as root and writes logs there. A local attacker can create or replace log files or symlinks in this directory so that privileged processes write to attacker-chosen paths, enabling log tampering and possible privilege escalation via symlink attacks. The TDgpt log directory should be owned by the service account and use restrictive permissions (e.g., 750 or 755), with no world-writable directories in the log path.
| ${csudo}mkdir -p ${logDir} && ${csudo}chmod 777 ${logDir} | |
| ${csudo}mkdir -p -m 750 "${logDir}" |
| rm -f ${lib_link_dir}/libtaosws.* || : | ||
| rm -f ${lib64_link_dir}/libtaosws.* || : | ||
| #rm -rf ${v15_java_app_dir} || : | ||
| cp -rf ${script_dir}/driver/* ${driver_path}/ && chmod 777 ${driver_path}/* |
There was a problem hiding this comment.
install_lib copies the TDengine client libraries into driver_path and immediately sets all of them to mode 777, after which symlinks from ${lib_link_dir} and ${lib64_link_dir} point to these world-writable .so files. This lets any local user modify or replace libtaos.so, libtaosnative.so, or libtaosws.so, so that subsequent executions of TDengine binaries or other processes linked against these libraries will load attacker-controlled code with their privileges (including root), leading to straightforward local privilege escalation. Library files under driver_path should be owned by root (or the service account) and kept non-world-writable (e.g., 755), with the public symlinks pointing only to such protected files.
| cp -rf ${script_dir}/driver/* ${driver_path}/ && chmod 777 ${driver_path}/* | |
| cp -rf ${script_dir}/driver/* ${driver_path}/ && chmod 755 ${driver_path}/* |
… improve OS compatibility
…d enhance path safety checks
…stallation script
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 21 out of 21 changed files in this pull request and generated 17 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from torch.utils.data import DataLoader | ||
| from tqdm import tqdm | ||
|
|
||
| pretrained_model = None |
There was a problem hiding this comment.
The global variable 'pretrained_model' is declared but never initialized before use. While there's a check at line 29 to verify if it's None, the variable should be explicitly initialized to None at the module level for clarity and to avoid potential UnboundLocalError issues.
packaging/tools/install.sh
Outdated
|
|
||
| # tools/services/config_files files setting | ||
| if [ "$verType" == "client" ]; then | ||
| # 仅 client 工具,不含服务 |
There was a problem hiding this comment.
The comment contains Chinese characters (仅 client 工具,不含服务) which may cause encoding issues in non-UTF-8 environments. Consider using English comments or ensuring proper encoding declarations.
| # 仅 client 工具,不含服务 | |
| # Client tools only, no services included |
|
|
||
| @app.route('/imputation', methods=['POST']) | ||
| def moment(): | ||
| if pretrained_model is None: | ||
| return jsonify({ | ||
| 'status': 'error', | ||
| 'error': 'Model not loaded yet' | ||
| }), 503 |
There was a problem hiding this comment.
The error handling returns a 503 status code when the model is not loaded, but there's no mechanism shown in this diff to actually initialize 'pretrained_model'. This could result in the endpoint being permanently unavailable after deployment. Consider adding an initialization function that's called during application startup or providing clear documentation about when and how the model gets loaded.
| @app.route('/imputation', methods=['POST']) | |
| def moment(): | |
| if pretrained_model is None: | |
| return jsonify({ | |
| 'status': 'error', | |
| 'error': 'Model not loaded yet' | |
| }), 503 | |
| def initialize_model(): | |
| """ | |
| Initialize the global pretrained_model if it has not been loaded yet. | |
| The model location can be configured via the following environment variables: | |
| - MOMENT_MODEL_PATH: Local path to a pre-downloaded model. | |
| - MOMENT_MODEL_REPO_ID: Hugging Face repo ID to download via snapshot_download | |
| (used only if MOMENT_MODEL_PATH is not set). | |
| If initialization fails, pretrained_model remains None and the API will | |
| continue to return a 503 status code for inference requests. | |
| """ | |
| global pretrained_model | |
| # Do nothing if the model is already initialized. | |
| if pretrained_model is not None: | |
| return | |
| model_path = os.environ.get("MOMENT_MODEL_PATH") | |
| if not model_path: | |
| repo_id = os.environ.get("MOMENT_MODEL_REPO_ID") | |
| if repo_id: | |
| try: | |
| model_path = snapshot_download(repo_id) | |
| except Exception: | |
| # Leave pretrained_model as None; the route will return 503. | |
| return | |
| else: | |
| # No configuration provided; cannot initialize the model. | |
| return | |
| try: | |
| # MOMENTPipeline is expected to provide a from_pretrained-like API. | |
| # Adjust this call if your version of MOMENTPipeline uses a different | |
| # initialization method. | |
| model = MOMENTPipeline.from_pretrained(model_path) | |
| except Exception: | |
| # Initialization failed; keep pretrained_model as None. | |
| return | |
| # Optionally move the model to the configured device if supported. | |
| try: | |
| model.to(device) | |
| except Exception: | |
| # If the model does not support .to(device), use it as-is. | |
| pass | |
| pretrained_model = model | |
| @app.before_first_request | |
| def _load_model_on_startup(): | |
| """ | |
| Flask hook that attempts to load the model before the first request. | |
| """ | |
| initialize_model() | |
| @app.route('/imputation', methods=['POST']) | |
| def moment(): | |
| # Ensure the model is initialized before handling the request. | |
| if pretrained_model is None: | |
| initialize_model() | |
| if pretrained_model is None: | |
| return jsonify({ | |
| 'status': 'error', | |
| 'error': 'Model not loaded yet' | |
| }), 503 |
| if [[ "$1" == "all" ]]; then | ||
| shift | ||
| for m in tdtsfm timesfm timemoe moirai chronos moment; do | ||
| ( | ||
| "$0" -c "$CONFIG_FILE" "$m" "$@" | sed "s/^/[$m] /" | ||
| ) & | ||
| sleep 1 | ||
| echo | ||
| done |
There was a problem hiding this comment.
The recursive call to "$0" in the loop could be problematic because parse_options has already been called and consumed the options. This means the -c option won't be properly parsed in the recursive calls. Consider refactoring to avoid recursion or ensure options are preserved correctly.
| } | ||
|
|
||
| ini_get() { | ||
| local section="$1" key="$2" default="${3-}" file="$CONFIG_FILE" |
There was a problem hiding this comment.
The default path assignment uses "-" for the default parameter expansion, but the function later uses an empty string check. Using ":-" (with colon) would be more appropriate to handle both unset and empty variables consistently.
| local section="$1" key="$2" default="${3-}" file="$CONFIG_FILE" | |
| local section="$1" key="$2" default="${3:-}" file="$CONFIG_FILE" |
| local in_section=0 line k v | ||
| while IFS= read -r line || [ -n "$line" ]; do | ||
| line="${line%%\#*}"; line="${line%%;*}" | ||
| line="$(printf '%s' "$line" | sed -e 's/^[[:space:]]\+//' -e 's/[[:space:]]\+$//')" |
There was a problem hiding this comment.
The sed command in parse_options uses basic regex but the pattern might fail if there are multiple spaces. Consider using extended regex (-E flag) or ensuring the pattern handles variable whitespace correctly.
| line="$(printf '%s' "$line" | sed -e 's/^[[:space:]]\+//' -e 's/[[:space:]]\+$//')" | |
| line="$(printf '%s' "$line" | sed -E -e 's/^[[:space:]]+//' -e 's/[[:space:]]+$//')" |
| chmod 666 /etc/hosts | ||
| echo "127.0.0.1 $1" >>/etc/hosts |
There was a problem hiding this comment.
This function changes /etc/hosts permissions to 666, making it world-writable and allowing any local user to modify system hostname resolution. On a multi-user system, an unprivileged attacker could poison /etc/hosts after installation to redirect connections from root or other services to attacker-controlled hosts, potentially enabling credential theft or access control bypass based on hostnames. Instead, keep /etc/hosts owned by root with restrictive permissions (e.g., 0644) and update it only with root privileges without relaxing the file mode for all users.
| chmod 666 /etc/hosts | |
| echo "127.0.0.1 $1" >>/etc/hosts | |
| echo "127.0.0.1 $1" >>/etc/hosts | |
| chmod 644 /etc/hosts |
…echo statements and improving argument handling in start-model script
Description
Issue(s)
Checklist
Please check the items in the checklist if applicable.