Skip to content

Conversation

@sunlin7
Copy link
Contributor

@sunlin7 sunlin7 commented Aug 4, 2020

Hi,
This change try to rewrite the switch_cut_path() function, for continue search the next sep with preview stopping position, while the original function will always try both two path-seps from beginning.
The new implemention will

  1. fix problem to deal some path like a\\b/c with both back slash and slash.
    $ mkdir a\\b/ && cd a\\b/ && git clone URL_FS
    Original function will give unexpect value b while new implemention will return expect value a\\b.

  2. reduce about 40% calling timing.
    $ gcc -std=c11 test-switch_cut_path.c && ./a.out
    func[0] take time 9 s
    func[1] take time 15 s

The testing code is

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>
const char *switch_cut_path(const char *in) {
  const char *p, *ret = in;
  const char delims[] = "/\\";
  const char *i;

  if (in) {
    for (i = delims; *i; i++) {
      p = in;
      while ((p = strchr(p, *i)) != 0) {
        ret = ++p;
      }
    }
    return ret;
  } else {
    return NULL;
  }
}

const char *switch_cut_path_new(const char *in) {
  const char *p = in, *ret = in;

  if (NULL == in) {
    return NULL;
  }
  while ((p = strchr(ret, '/')) != NULL) {
    ret = ++p;
  }
  while ((p = strchr(ret, '\\')) != NULL) {
    ret = ++p;
  }
  return ret;
}

int main(int argc, char *argv[])
{
  char seps[] = "/\\";
  char path[] = "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ";
  const char *(*funcs[])(const char *) = {switch_cut_path_new, switch_cut_path};
  for (int i = 0; i < sizeof(funcs)/sizeof(funcs[0]); i++) {
    time_t t1 = time(NULL);
    for (int j = 0; j < 1024 * 1024 * 200; j++) {
      int idx = j % sizeof(path);
      char orig = path[idx];
      path[idx] = '/';
      funcs[i](path);
      path[idx] = '\\';
      funcs[i](path);
      path[idx] = orig;
    }
    time_t t2 = time(NULL);
    printf("func[%d] take time %ld s\n", i, t2 - t1);
  }
  return 0;
}

@sunlin7
Copy link
Contributor Author

sunlin7 commented Aug 4, 2020

@andywolk Could you help to review this change please? Thanks

@mjerris
Copy link
Collaborator

mjerris commented Aug 4, 2020

This changes behavior of the function when either using mixed / and \ or when using unix delims on windows. If you want to optimize this function the first step would be to do a set of unit tests including those scenarios. Is there a compelling reason to optimize this function, if so please provide some detail of the scenario that has this function being called frequently

@mjerris
Copy link
Collaborator

mjerris commented Aug 4, 2020

any optimization or refactoring type changes should start with a complete set of unit tests to confirm we don’t break anything in the process.

@sunlin7 sunlin7 force-pushed the optimizate-switch_cut_path branch 2 times, most recently from 9f4cbd5 to d9fdc2b Compare August 4, 2020 08:34
@sunlin7 sunlin7 force-pushed the optimizate-switch_cut_path branch from d9fdc2b to c1f6041 Compare August 4, 2020 08:50
@sunlin7
Copy link
Contributor Author

sunlin7 commented Aug 4, 2020

@mjerris I turn on all level log, and the switch_cut_path() will be called for each log printing, so I try to make the calling faster.
The changes has updated for adding test cases in the tests/unit/switch_utils.c.
Please notice that the new implemention will give correct result for path like a\\b/c, while original function maybe has problem.
Here's an example:
$ makedir -p a\\b/ && touch a\\b/c
original switch_cut_path("a\\b/c") return "b/c"
new switch_cut_path("a\\b/c") return "c"

@mjerris
Copy link
Collaborator

mjerris commented Aug 4, 2020

does the new way will break cutting unix style paths on windows?

@sunlin7
Copy link
Contributor Author

sunlin7 commented Aug 4, 2020

Hi @mjerris The updated changes won't break cutting unix style paths on windows. The changes didn't depend on the macro WIN32, always search for both / and \\, ack like it's preview version.

@sunlin7
Copy link
Contributor Author

sunlin7 commented Aug 4, 2020

The unit testing case will verify it's behavior.

fst_check_string_equals(switch_cut_path("switch-cut-path"), "switch-cut-path");
fst_check_string_equals(switch_cut_path("switch/cut-path"), "cut-path");
fst_check_string_equals(switch_cut_path("switch/cut\\path"), "path");
fst_check_string_equals(switch_cut_path("switch\\cut/path"), "path");

@sunlin7 sunlin7 changed the title [switch_utils.c] optimization for switch_cut_path() function. [switch_utils.c] fix switch_cut_path() function for both slash and back slash in path Aug 7, 2020
@sunlin7
Copy link
Contributor Author

sunlin7 commented Aug 7, 2020

@mjerris The new version released days ago, shall we merge this bug fix then it will get wide testing before next release? Thanks

@sunlin7
Copy link
Contributor Author

sunlin7 commented Aug 26, 2020

@mjerris Could you review this bug fix please? It will fix the incorrect path spliting on *nix, and keep behavior on Windows.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants