Hopper Architecture · 80GB HBM3

NVIDIA H100 GPU

From $2.69/hr — billed by the minute

The flagship datacenter GPU for AI training and inference. 4th-gen Tensor Cores with FP8 support, Transformer Engine, and 3.35 TB/s bandwidth. Train and serve the largest LLMs.

View All GPU Pricing
H100: $2.69/hr · Per-minute billing · No commitments
Trusted worldwide

Powering teams that push boundaries

27,000+ AI developers
50M+ GPU hours served
99.9% Uptime SLA
<90s Instance launch

Trusted by companies including: Tesla, Hugging Face, Kaggle, Zoho, Weights & Biases, upGrad, Saama

Why H100

The fastest GPU for AI training and inference

The H100 delivers breakthrough performance with Hopper architecture, Transformer Engine, and native FP8 support.

Transformer Engine

Dynamic FP8/FP16 mixed precision per layer, per iteration. Near-FP16 quality at half the memory footprint. Automatic — no code changes needed.

3.35 TB/s Memory Bandwidth

64% faster than A100 80GB (3,350 vs 2,039 GB/s). Keeps 528 Tensor Cores saturated. Eliminates memory bottlenecks on large model training.
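The 64% figure follows directly from the published peak bandwidth numbers; a quick check:

```python
# Published peak memory bandwidth (GB/s) for each card.
h100_bw = 3350  # H100 SXM, HBM3
a100_bw = 2039  # A100 80GB SXM, HBM2e

# Relative speedup of the H100 over the A100.
speedup = h100_bw / a100_bw - 1
print(f"{speedup:.0%}")  # 64%
```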

8-GPU NVLink Scaling

900 GB/s per GPU with 4th-gen NVLink. Up to 8 GPUs per instance with 640GB unified memory — enough for training 70B+ parameter models.

Native FP8 Support

Halve model memory footprint and double inference throughput with hardware-native FP8 compute. First GPU with dedicated FP8 Tensor Core instructions.
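A back-of-envelope sketch of why FP8 matters for serving: weights alone for a 70B-parameter model take 140 GB in FP16 but 70 GB in FP8, which is what lets a 70B model fit in the H100's 80 GB of HBM3 (ignoring KV cache and activation overhead):

```python
def weight_gb(params_b: float, bytes_per_param: int) -> float:
    """Approximate weight memory in GB: params_b billion params x bytes each.

    Ignores KV cache, activations, and framework overhead.
    """
    return params_b * bytes_per_param

print(weight_gb(70, 2))  # FP16: 140.0 GB -> needs 2x H100
print(weight_gb(70, 1))  # FP8:   70.0 GB -> fits in 80 GB HBM3
```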

Specs

Key specs at a glance

VRAM: 80 GB · HBM3 with ECC
Memory Bandwidth: 3,350 GB/s · 64% faster than A100 80GB
Tensor Performance: 989 TFLOPS · FP16 / BF16
Multi-GPU Memory: 640 GB · 8x H100 via NVLink

Pricing

78% less than AWS — rent a single GPU, not eight

Transparent, per-minute billing with no hidden fees. Pause anytime — only pay for active minutes.
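Per-minute billing means a short run costs exactly its active minutes. A quick sketch (rate from this page; the session lengths are hypothetical):

```python
RATE_PER_HOUR = 2.69  # H100 on-demand rate from this page

def session_cost(minutes: int, rate_per_hour: float = RATE_PER_HOUR) -> float:
    """Cost in dollars for an instance active for `minutes` minutes."""
    return round(minutes / 60 * rate_per_hour, 2)

print(session_cost(95))  # 95 active minutes -> $4.26
print(session_cost(60))  # one hour          -> $2.69
```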

Jarvislabs: $2.69/hr · Per minute · Min 1 GPU · Pre-configured
AWS (p5.48xlarge): ~$12.38/GPU/hr · Per second · Min 8 GPUs (bundled)
Azure (ND H100 v5): ~$9.74/GPU/hr · Per second · Min 8 GPUs (bundled)
Google Cloud (a3-highgpu): ~$5.07/hr · Per second · Min 1 GPU
RunPod (Secure): $3.29–3.89/hr · Per second · Min 1 GPU
Lambda: $2.49/hr · Per hour · Min 1 GPU

AWS charges $98.32/hr for 8x H100 (p5.48xlarge). On Jarvislabs, rent 1 GPU for $2.69/hr or 8 for $21.52/hr. Save 78% per GPU-hour vs. AWS.
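The 78% figure is straightforward to verify from the listed rates:

```python
aws_per_gpu = 98.32 / 8  # p5.48xlarge hourly rate split across its 8 GPUs
jarvis = 2.69            # single-GPU hourly rate on this page

savings = 1 - jarvis / aws_per_gpu
print(f"{savings:.0%}")  # 78%
```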

Use Cases

What developers build on the H100

From training 70B+ LLMs to production inference with FP8.

Train Large Language Models

Train and fine-tune 70B+ models with FP8 mixed precision. 8x H100 handles 400B+ parameter models with FSDP/DeepSpeed. Transformer Engine optimizes precision automatically.
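Why multi-GPU sharding matters here: with a standard mixed-precision Adam setup, model state alone is roughly 16 bytes per parameter (FP16 weights and gradients plus FP32 master weights, momentum, and variance). FSDP and DeepSpeed ZeRO-3 shard that state evenly across GPUs. A rough rule-of-thumb estimate, ignoring activations and offload:

```python
def state_gb_per_gpu(params_b: float, num_gpus: int, bytes_per_param: int = 16) -> float:
    """Rule-of-thumb model-state memory per GPU under full ZeRO-3/FSDP sharding.

    16 bytes/param = FP16 weights (2) + FP16 grads (2) + FP32 master
    weights (4) + FP32 Adam momentum (4) + FP32 Adam variance (4).
    Ignores activations, fragmentation, and CPU/NVMe offload.
    """
    return params_b * bytes_per_param / num_gpus

print(state_gb_per_gpu(70, 8))  # 140.0 GB/GPU -> full 70B fine-tuning needs offload, FP8, or LoRA
print(state_gb_per_gpu(7, 8))   #  14.0 GB/GPU -> 7B fits comfortably
```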

LLaMA 3 70B · Mixtral 8x22B · Falcon 180B

Production LLM Inference

Serve 70B models on a single GPU with FP8 quantization via TensorRT-LLM or vLLM. 2-3x faster than the A100. Hardware-native FP8 avoids software quantization overhead.

vLLM · TGI · TensorRT-LLM

Image & Video Generation

FLUX, Stable Diffusion, and video models at maximum speed. 80GB fits multiple models simultaneously. 3.35 TB/s bandwidth eliminates pipeline stalls.

FLUX · SDXL · Stable Video Diffusion

Research & Experimentation

Fast iteration cycles: train, evaluate, iterate. FP8 training cuts memory in half, and Transformer Engine optimizes precision automatically per layer.

PyTorch · JAX · DeepSpeed · Megatron

Full Specs

Technical specifications

Complete hardware specifications for the NVIDIA H100 data center GPU.

Architecture: NVIDIA Hopper · Next-gen AI performance
CUDA Cores: 16,896 · General-purpose GPU compute
Tensor Cores: 528 (4th gen) · FP8/FP16/BF16/TF32/FP64
VRAM: 80 GB HBM3 (ECC) · Models up to 70B+ in FP8
Memory Bandwidth: 3,350 GB/s · 64% faster than A100 80GB
FP32 Performance: 67 TFLOPS · Traditional compute
TF32 Tensor: 495 TFLOPS · Auto mixed-precision training
FP16/BF16 Tensor: 989 TFLOPS · Mixed-precision training
FP8 Tensor: 1,979 TFLOPS · Via Transformer Engine
INT8 Tensor: 1,979 TOPS · Quantized inference
Transformer Engine: Yes (1st gen, FP8/FP16) · Automatic mixed precision
NVLink: 900 GB/s bidirectional (4th gen) · Multi-GPU training
PCIe: Gen5 x16 · Host data transfer
TDP: 700W (SXM) · Maximum performance
Multi-GPU: Up to 8x per instance · 640GB unified memory

Get Started

Launch your H100 instance in seconds

Three simple steps from sign-up to a running GPU instance.

01

Choose Template

PyTorch 2.x (with Transformers, DeepSpeed, PEFT, Accelerate), TensorFlow, JAX, or clean CUDA. Transformer Engine pre-configured.

02

Configure & Launch

Select H100, 1–8 GPUs, allocate storage. Templates ready in seconds, VMs in under a minute.

03

Train at Scale

DeepSpeed, FSDP, and Transformer Engine pre-configured for multi-GPU. Pause when idle, resume from checkpoint.

Manage via CLI

Create and manage H100 instances from your terminal.

jl create --gpu H100
Explore CLI
FAQ

Frequently asked questions

Everything you need to know about renting the NVIDIA H100 on Jarvislabs.

What can I train on the H100?

Fine-tune 70B-parameter models in FP8 on a single GPU. 8x H100 (640GB) handles 180B+ full fine-tuning. Common workloads: LLaMA 3 70B, Mixtral 8x22B, SDXL training.

Start training on the NVIDIA H100 in seconds

$2.69/hr with per-minute billing. 80GB HBM3. Up to 8 GPUs. No commitments.

Compare All GPUs