Skip to content

Conversation

@nkreeger
Copy link
Contributor

@nkreeger nkreeger commented Sep 11, 2019

To see the logs from the Cloud Build CI, please join either our discussion or announcement mailing list.


This change is Reviewable

@googlebot
Copy link

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@dsmilkov dsmilkov added cla: yes and removed cla: no labels Sep 12, 2019
@googlebot
Copy link

A Googler has manually verified that the CLAs look good.

(Googler, please make sure the reason for overriding the CLA status is clearly documented in these comments.)

ℹ️ Googlers: Go here for more info.

@googlebot
Copy link

All (the pull request submitter and all commit authors) CLAs are signed, but one or more commits were authored or co-authored by someone other than the pull request submitter.

We need to confirm that all authors are ok with their commits being contributed to this project. Please have them confirm that by leaving a comment that contains only @googlebot I consent. in this pull request.

Note to project maintainer: There may be cases where the author cannot leave a comment, or the comment is not properly detected as consent. In those cases, you can manually confirm consent of the commit author(s), and set the cla label to yes (if enabled on your project).

ℹ️ Googlers: Go here for more info.

@nkreeger nkreeger changed the title [wip] Refactor WebGLContext instance ownage to a global. Refactor WebGLContext instance ownage to a global. Sep 12, 2019
@nkreeger nkreeger requested review from dsmilkov, nsthorat and tafsiri and removed request for nsthorat September 12, 2019 21:40
@nkreeger
Copy link
Contributor Author

@googlebot I consent.

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 13 of 19 files at r1, 5 of 8 files at r2, 3 of 4 files at r5.
Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @nkreeger, @nsthorat, and @tafsiri)


tfjs-core/src/backends/webgl/canvas_util.ts, line 38 at r5 (raw file):

}

export function cleanupDOMCanvasWebGLRenderingContext(

for consistency, consider renaming to browserContextCleanup so then it goes well with setContextCleanup(browserContextFactory)


tfjs-core/src/backends/webgl/canvas_util.ts, line 41 at r5 (raw file):

    context: WebGLRenderingContext) {
  if (context == null) {
    throw new Error('Shold not hit this case');

improve the error message. Something like: "Failed to cleanup the active context. The context object passed to the callback is null"


tfjs-core/src/backends/webgl/canvas_util.ts, line 49 at r5 (raw file):

}

export function createDOMCanvasWebGLRenderingContext(webGLVersion: number):

for consistency, consider renaming to browserContextFactory so then it goes well with setContextFactory(browserContextFactory)


tfjs-core/src/backends/webgl/canvas_util.ts, line 56 at r5 (raw file):

  const canvas = createCanvas(webGLVersion);

  canvas.addEventListener('webglcontextlost', (ev: Event) => {

since the cleanup doesn't happen in canvas_util anymore, no need to listen for webglcontextlost event here.


tfjs-core/src/backends/webgl/flags_webgl_test.ts, line 280 at r5 (raw file):

  beforeEach(() => {
    webgl_util.resetMaxTexturesInShader();
    // Forge the max textures flag between different tests.

my typo: /s/forge/forget


tfjs-core/src/backends/webgl/gpgpu_context.ts, line 106 at r5 (raw file):

          'disposing.');
    }
    const debug = true;

any reason why you didn't keep this.debug ? Is there benefit for doing the cleanup calls always in debug mode?


tfjs-core/src/backends/webgl/gpgpu_context.ts, line 122 at r5 (raw file):

    this.throwIfDisposed();
    return gpgpu_util.createFloat32MatrixTexture(
        getActiveContext(), this.debug, rows, columns, this.textureConfig);

What problems do you foresee if we cache the gl inside the GpgpuContext? Because I'm concerned that without caching we now do multiple function calls with a dictionary lookup (via getActiveContext()) whenever we need to create/write/read/dispose a texture which we do in high volume.

If you have strong reasons to avoid caching, we should do quick profiling on existing models, e.g. the facemesh model, to understand if removing caching adds overhead to the run-time.


tfjs-core/src/backends/webgl/gpgpu_context.ts, line 166 at r5 (raw file):

    this.throwIfDisposed();
    return gpgpu_util.createPackedMatrixTexture(
        getActiveContext(), true, rows, columns, this.textureConfig);

this.debug instead of always true?


tfjs-core/src/backends/webgl/gpgpu_context.ts, line 177 at r5 (raw file):

    }
    callAndCheck(
        getActiveContext(), true,

same here, this.debug


tfjs-core/src/backends/webgl/gpgpu_context_test.ts, line 178 at r5 (raw file):

});

describeWithFlags('GPGPUContext dispose', DOWNLOAD_FLOAT_ENVS, () => {

any reason why you moved this in a separate describeWithFlags?


tfjs-core/src/backends/webgl/webgl_context_manager.ts, line 22 at r5 (raw file):

import {checkWebGLError} from './webgl_check';

let count = 0;

is count used anywhere? otherwise let's remove and add on a need-by basis.


tfjs-core/src/backends/webgl/webgl_util.ts, line 188 at r5 (raw file):

    gl: WebGLRenderingContext, debug: boolean): WebGLFramebuffer {
  return throwIfNull<WebGLFramebuffer>(
      gl, true, () => gl.createFramebuffer(),

s/true/debug


tfjs-core/src/backends/webgl/webgl_util.ts, line 513 at r5 (raw file):

export function hasExtension(gl: WebGLRenderingContext, extensionName: string) {
  const ext = gl.getExtension(extensionName);
  try {

Any reason why we check for error? hasExtension sounds like something that should return a boolean without throwing an error.

If you need something that throws an error, consider adding a new method.

Copy link
Contributor

@nsthorat nsthorat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(sorry for the delay, I had these comments last week but Reviewable was down).

One thing that I think is important with this is profiling because this is adding multiple function calls and dictionary lookups for critical code paths, could you look at a real model with this and make sure it's not slower?

Also:
https://console.cloud.google.com/cloud-build/builds/d67592dc-8399-4bdd-9904-dc2e77340258?project=learnjs-174218

Looks like in a webgl1 test we're trying to use a webgl2renderingcontext

Reviewed 13 of 19 files at r1, 5 of 8 files at r2, 3 of 4 files at r5.
Reviewable status: 0 of 1 approvals obtained (waiting on @nkreeger, @nsthorat, and @tafsiri)


tfjs-core/src/backends/webgl/canvas_util_test.ts, line 36 at r5 (raw file):

  it('Returns a valid gl context', () => {
    const gl =
        createDOMCanvasWebGLRenderingContext(ENV.getNumber('WEBGL_VERSION'));

should you be cleaning these up at the end of the test?


tfjs-core/src/backends/webgl/flags_webgl_test.ts, line 56 at r5 (raw file):

    expect(ENV.getBool('WEBGL_RENDER_FLOAT32_CAPABLE')).toBe(true);
    // Undo the forcing of half float.
    delete ENV.getFlags()['WEBGL_FORCE_F16_TEXTURES'];

i dont think you need to do this because it's in a describeWithFlags (which resets flags between tests)


tfjs-core/src/backends/webgl/flags_webgl_test.ts, line 254 at r5 (raw file):

describe('WEBGL_MAX_TEXTURE_SIZE', () => {
  beforeEach(() => {
    webgl_util.resetMaxTextureSize();

do you need to reset the flag here too?


tfjs-core/src/backends/webgl/webgl_context_manager.ts, line 22 at r5 (raw file):

import {checkWebGLError} from './webgl_check';

let count = 0;

maybe something like contextCount?


tfjs-core/src/backends/webgl/webgl_context_manager.ts, line 70 at r5 (raw file):

  if (!(version in contexts)) {
    contexts[version] = contextFactory(version), ++count;

maybe split this onto its own line for readability

Copy link
Contributor

@dsmilkov dsmilkov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @nkreeger, @nsthorat, and @tafsiri)


tfjs-core/src/backends/webgl/flags_webgl_test.ts, line 56 at r5 (raw file):

Previously, nsthorat (Nikhil Thorat) wrote…

i dont think you need to do this because it's in a describeWithFlags (which resets flags between tests)

describeWithFlags only resets flags between describes, not between its for speed. If the flag is not removed there, the next it will fail

Copy link
Contributor

@nsthorat nsthorat left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @nkreeger, and @tafsiri)


tfjs-core/src/backends/webgl/flags_webgl_test.ts, line 56 at r5 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

describeWithFlags only resets flags between describes, not between its for speed. If the flag is not removed there, the next it will fail

Apologies, that's right.

Copy link
Contributor Author

@nkreeger nkreeger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated - merged master into this branch. Still unresolved (sorry I don't have time to dig in here):

  • Speed of dictionary lookup
  • Fixing of flags_webgl_test.ts (I think this crashes the whole unit test - I think @dsmilkov found something on this forcing as webgl1.)

Reviewable status: 0 of 1 approvals obtained (waiting on @dsmilkov, @nkreeger, and @tafsiri)


tfjs-core/src/backends/webgl/gpgpu_context.ts, line 106 at r5 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

any reason why you didn't keep this.debug ? Is there benefit for doing the cleanup calls always in debug mode?

Done.


tfjs-core/src/backends/webgl/gpgpu_context.ts, line 166 at r5 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

this.debug instead of always true?

Done.


tfjs-core/src/backends/webgl/gpgpu_context_test.ts, line 178 at r5 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

any reason why you moved this in a separate describeWithFlags?

I think I didn't want to mess with the GPGPU context disposing in the context of the other tests.


tfjs-core/src/backends/webgl/webgl_util.ts, line 188 at r5 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

s/true/debug

Done.


tfjs-core/src/backends/webgl/webgl_util.ts, line 513 at r5 (raw file):

Previously, dsmilkov (Daniel Smilkov) wrote…

Any reason why we check for error? hasExtension sounds like something that should return a boolean without throwing an error.

If you need something that throws an error, consider adding a new method.

Old debugging stuff - cleaned up (same throughout this)

@nsthorat
Copy link
Contributor

Thanks for updating this PR Nick!

Looks like there are still some failures:
https://pantheon.corp.google.com/cloud-build/builds/b6c14fb0-ffae-448c-a535-7b57a1e81725;step=2?project=learnjs-174218

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants