algattik/blob-index

Blob index and change feed sample

Getting Started

This sample deploys:

  • an Azure Storage Account with blob indexing and change feed enabled
  • an Azure Event Hub with capture set up to the storage account
  • an Azure Container Instance containing a simulator sending events to the Event Hub

Running the sample causes blobs to be written every few seconds to the storage account.

The sample also contains a .NET application demonstrating the use of the blob change feed processor library to reliably consume the blob change feed.

Prerequisites

The installation steps below rely on the following tools:

  • git
  • Azure CLI
  • Terraform
  • .NET SDK

Note: you can also use Azure Cloud Shell to avoid installing software locally.

Installation

  • git clone https://github.com/algattik/blob-index.git

  • cd blob-index

  • Log in with the Azure CLI (skip this step in Azure Cloud Shell):

    az login
  • Run:

    terraform init
    terraform apply

    When prompted, answer yes to deploy the solution.

    Take note of the storage_account_url output shown.

  • Run:

    cd src
    dotnet run https://STORAGE_ACCOUNT_NAME.blob.core.windows.net/

    Replace the URL with the storage_account_url Terraform output.

Destroying the solution

Run:

terraform destroy

Results

Blob index

The blob index is populated only once every 24 hours. After the solution has run for a few days, several inventory listings appear in the container root.

The inventory is a 3-column CSV file. Here are some sample records showing a JSON blob produced by the blob change feed, a CSV blob produced by a previous run of the blob index, and two Avro blobs generated by Event Hubs Capture.

Name                                                        Last-Modified         Content-Length
$blobchangefeed/idx/segments/1601/01/01/0000/meta.json      2023-09-08T07:41:20Z  453
container1/2023/09/19/04-21-46/rule1/rule1.csv              2023-09-19T04:34:55Z  27410309
container1/evh-blbidx/ingestion/0/2023-09-08/07/51:09.avro  2023-09-08T07:52:09Z  777727
container1/evh-blbidx/ingestion/0/2023-09-08/07/52:09.avro  2023-09-08T07:53:10Z  631471
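
A listing like this can be summarized with a few lines of Python. The sketch below is illustrative only and is not part of the sample: it assumes the file is comma-separated with the three header names shown above, and that the first path segment of Name is the container, as the sample records suggest. The function name bytes_per_container is hypothetical.

```python
import csv
import io
from collections import defaultdict

# The four sample records from this README, written as comma-separated text.
# Assumption: the real inventory file uses these exact header names.
SAMPLE = """\
Name,Last-Modified,Content-Length
$blobchangefeed/idx/segments/1601/01/01/0000/meta.json,2023-09-08T07:41:20Z,453
container1/2023/09/19/04-21-46/rule1/rule1.csv,2023-09-19T04:34:55Z,27410309
container1/evh-blbidx/ingestion/0/2023-09-08/07/51:09.avro,2023-09-08T07:52:09Z,777727
container1/evh-blbidx/ingestion/0/2023-09-08/07/52:09.avro,2023-09-08T07:53:10Z,631471
"""


def bytes_per_container(csv_text):
    """Sum Content-Length per container (first path segment of Name)."""
    totals = defaultdict(int)
    for row in csv.DictReader(io.StringIO(csv_text)):
        container = row["Name"].split("/", 1)[0]
        totals[container] += int(row["Content-Length"])
    return dict(totals)


totals = bytes_per_container(SAMPLE)
for container, size in sorted(totals.items()):
    print(f"{container}: {size} bytes")
```

For a real listing, read the downloaded CSV file and pass its text to the function instead of SAMPLE.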

Blob change feed

Letting the solution run for a few hours before running the application shows events being consumed at a rate of roughly 1,000 to 1,250 per second:

New Cursor: {"CursorVersion":1,"UrlHost":"stblbidx4745ae3ace19887e.blob.core.windows.net","EndTime":null,"CurrentSegmentCursor":{"ShardCursors":[{"CurrentChunkPath":"log/00/2023/09/20/0600/00000.avro","BlockOffset":889375,"EventIndex":12}],"CurrentShardPath":"log/00/2023/09/20/0600/","SegmentPath":"idx/segments/2023/09/20/0600/meta.json"}}
Processing [60000 events in batch] [Rate: 1252.0800179297858 events/s]

Event types:
 BlobCreated: 60000

New Cursor: {"CursorVersion":1,"UrlHost":"stblbidx4745ae3ace19887e.blob.core.windows.net","EndTime":null,"CurrentSegmentCursor":{"ShardCursors":[{"CurrentChunkPath":"log/00/2023/09/20/1100/00000.avro","BlockOffset":122892,"EventIndex":32}],"CurrentShardPath":"log/00/2023/09/20/1100/","SegmentPath":"idx/segments/2023/09/20/1100/meta.json"}}
Processing [65000 events in batch] [Rate: 1044.179004859818 events/s]
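
The cursor printed before each batch is a JSON document recording how far the consumer has read. As a hedged illustration (the cursor layout is owned by the change feed library and may change, so the field names here are simply those visible in the output above), the following Python sketch extracts the segment and per-shard position from the first cursor shown. The function name describe_cursor is hypothetical.

```python
import json

# First cursor from the output above, copied verbatim.
CURSOR = (
    '{"CursorVersion":1,"UrlHost":"stblbidx4745ae3ace19887e.blob.core.windows.net",'
    '"EndTime":null,"CurrentSegmentCursor":{"ShardCursors":[{"CurrentChunkPath":'
    '"log/00/2023/09/20/0600/00000.avro","BlockOffset":889375,"EventIndex":12}],'
    '"CurrentShardPath":"log/00/2023/09/20/0600/",'
    '"SegmentPath":"idx/segments/2023/09/20/0600/meta.json"}}'
)


def describe_cursor(cursor_json):
    """Return the host, segment path, and per-shard positions of a cursor."""
    cursor = json.loads(cursor_json)
    segment = cursor["CurrentSegmentCursor"]
    return {
        "host": cursor["UrlHost"],
        "segment": segment["SegmentPath"],
        # One (chunk path, block offset, event index) tuple per shard.
        "shards": [
            (s["CurrentChunkPath"], s["BlockOffset"], s["EventIndex"])
            for s in segment["ShardCursors"]
        ],
    }


info = describe_cursor(CURSOR)
print(info["segment"])
for chunk, offset, index in info["shards"]:
    print(f"  {chunk} at block offset {offset}, event {index}")
```

This is for inspection only; an application resuming from a cursor should pass the string back to the library unchanged.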

Manually deleting some blobs from the Azure portal shows other event types:

Processing [12 events in batch] [Rate: 0.1929117534186535 events/s]
Event types:
 BlobCreated: 66609
 BlobDeleted: 2
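
The "Processing" lines in this section can be turned into approximate batch durations. The Python sketch below is a rough illustration, assuming the reported rate is the batch size divided by the elapsed time for that batch; the README does not state how the application computes it. The names LINE and batch_durations are hypothetical.

```python
import re

# The three "Processing" lines shown in this section, copied verbatim.
LOG = """\
Processing [60000 events in batch] [Rate: 1252.0800179297858 events/s]
Processing [65000 events in batch] [Rate: 1044.179004859818 events/s]
Processing [12 events in batch] [Rate: 0.1929117534186535 events/s]
"""

LINE = re.compile(
    r"Processing \[(\d+) events in batch\] \[Rate: ([\d.]+) events/s\]"
)


def batch_durations(log_text):
    """Return (events, rate, seconds) per batch, with seconds = events / rate."""
    batches = []
    for match in LINE.finditer(log_text):
        events = int(match.group(1))
        rate = float(match.group(2))
        # Assumption: rate = events / elapsed seconds for the batch.
        batches.append((events, rate, events / rate))
    return batches


batches = batch_durations(LOG)
for events, rate, seconds in batches:
    print(f"{events} events at {rate:.1f} events/s, about {seconds:.1f} s")
```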
