---
license: apache-2.0
language:
- en
base_model:
- ibm-granite/granite-3.3-2b-instruct
library_name: transformers
---
# Granite-speech-3.3-2b

**Model Summary:**
Granite-speech-3.3-2b is a compact and efficient speech-language model, specifically designed for automatic speech recognition (ASR) and automatic speech translation (AST). Unlike integrated models that combine speech and language into a single pass, granite-speech-3.3-2b uses a two-pass design: an initial call transcribes audio files into text, and a second, explicitly initiated call is required to process the transcribed text with the underlying Granite language model.

The model was trained on a collection of public corpora comprising diverse datasets for ASR and AST, as well as synthetic datasets tailored to support the speech translation task. Granite-speech-3.3-2b was trained by modality-aligning granite-3.3-2b-instruct (https://huggingface.co/ibm-granite/granite-3.3-2b-instruct) to speech on publicly available open-source corpora containing audio inputs and text targets.
We are currently investigating an issue with greedy decoding (`num_beams=1`); the model performs reliably with beam sizes > 1, which we recommend for all use cases. Additionally, the model may occasionally hallucinate on very short audio inputs (<0.1s). These issues are under active investigation, and we will update guidance as fixes become available.
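Until fixes land, a simple input check can enforce both recommendations before calling `generate`. This is an illustrative sketch only; `example.wav` is a placeholder path, and the 0.1-second threshold simply mirrors the guidance above:

```python
import torchaudio

MIN_DURATION_S = 0.1  # very short clips may trigger hallucinations (see above)

wav, sr = torchaudio.load("example.wav", normalize=True)  # placeholder path
if wav.shape[1] / sr < MIN_DURATION_S:
    raise ValueError("Audio shorter than 0.1s; output may be unreliable.")

# When generating, prefer beam search over greedy decoding, e.g.:
# model.generate(..., num_beams=4, do_sample=False)
```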
**Evaluations:**

We evaluated granite-speech-3.3-2b alongside other speech-language models (SLMs) with fewer than 8b parameters, as well as dedicated ASR and AST systems, on standard benchmarks. The evaluation spanned multiple public benchmarks, with particular emphasis on English ASR tasks, while also including AST for En-X translation.

![image/png](https://cdn-uploads.huggingface.co/production/uploads/666ec38102791b3b49f453e8/7n0soblI3pCISpHbwFHI8.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/666ec38102791b3b49f453e8/6m5wBbl2UTM-MWO1-f8Pf.png)

![image/png](https://cdn-uploads.huggingface.co/production/uploads/666ec38102791b3b49f453e8/cVzCIuH0x_8W7Pz6QHwE8.png)
**Release Date:** April 28, 2025

**License:** [Apache 2.0](https://www.apache.org/licenses/LICENSE-2.0)

**Supported Languages:**
English
**Intended Use:**
The model is intended to be used in enterprise applications that involve processing of speech inputs. In particular, the model is well-suited for English speech-to-text and for speech translation from English to some major European languages, such as French, Spanish, Italian, German, and Portuguese, as well as Japanese and Mandarin. For tasks that exclusively involve text-based input, we suggest using our Granite large language models, which are optimized for text-only processing and offer superior performance compared to this model.

## Generation:

The Granite Speech model is supported natively in `transformers` from the `main` branch. Below is a simple example of how to use the granite-speech-3.3-2b model.
### Usage with `transformers`

First, make sure to build the latest version of `transformers` from source:
```shell
pip install https://github.com/huggingface/transformers/archive/main.zip torchaudio peft soundfile
```

Then run the code:
```python
import torch
import torchaudio
from transformers import AutoProcessor, AutoModelForSpeechSeq2Seq
from huggingface_hub import hf_hub_download

device = "cuda" if torch.cuda.is_available() else "cpu"

model_name = "ibm-granite/granite-speech-3.3-2b"
speech_granite_processor = AutoProcessor.from_pretrained(model_name)
tokenizer = speech_granite_processor.tokenizer
speech_granite = AutoModelForSpeechSeq2Seq.from_pretrained(model_name).to(device)

# prepare speech and text prompt, using the appropriate prompt template
audio_path = hf_hub_download(repo_id=model_name, filename='10226_10111_000000.wav')
wav, sr = torchaudio.load(audio_path, normalize=True)
assert wav.shape[0] == 1 and sr == 16000  # mono, 16khz

# create text prompt
chat = [
    {
        "role": "system",
        "content": "Knowledge Cutoff Date: April 2024.\nToday's Date: April 28, 2025.\nYou are Granite, developed by IBM. You are a helpful AI assistant",
    },
    {
        "role": "user",
        "content": "<|audio|>can you transcribe the speech into a written format?",
    }
]

text = tokenizer.apply_chat_template(
    chat, tokenize=False, add_generation_prompt=True
)

# compute audio embeddings
model_inputs = speech_granite_processor(
    text,
    wav,
    device=device,  # Computation device; returned tensors are put on CPU
    return_tensors="pt",
).to(device)

model_outputs = speech_granite.generate(
    **model_inputs,
    max_new_tokens=200,
    num_beams=4,
    do_sample=False,
    min_length=1,
    top_p=1.0,
    repetition_penalty=1.0,
    length_penalty=1.0,
    temperature=1.0,
    bos_token_id=tokenizer.bos_token_id,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
)

# Transformers includes the input IDs in the response.
num_input_tokens = model_inputs["input_ids"].shape[-1]
new_tokens = torch.unsqueeze(model_outputs[0, num_input_tokens:], dim=0)

output_text = tokenizer.batch_decode(
    new_tokens, add_special_tokens=False, skip_special_tokens=True
)
print(f"STT output = {output_text[0].upper()}")
```
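The example above asserts mono, 16 kHz input. If your audio comes in a different format, a small preprocessing step along these lines (a sketch; `input.wav` is a placeholder path) brings it into the expected shape:

```python
import torchaudio

wav, sr = torchaudio.load("input.wav", normalize=True)  # placeholder path
if wav.shape[0] > 1:
    wav = wav.mean(dim=0, keepdim=True)  # downmix multi-channel audio to mono
if sr != 16000:
    wav = torchaudio.functional.resample(wav, orig_freq=sr, new_freq=16000)
    sr = 16000
```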
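This completes the first pass of the two-pass design described in the model summary. Processing the transcript with the underlying Granite language model requires a second, explicit call. Below is a minimal sketch of that second pass, reusing `speech_granite`, `tokenizer`, `device`, and `output_text` from the example above; the summarization prompt is illustrative, and we assume a prompt without the `<|audio|>` token is handled by the language model alone:

```python
# Second pass: send the first-pass transcript back as a text-only prompt.
# The summarization instruction is illustrative; any text task works here.
chat = [
    {
        "role": "user",
        "content": f"Summarize the following transcript in one sentence:\n{output_text[0]}",
    }
]
prompt = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
# The chat template already inserts the special tokens.
text_inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(device)
text_outputs = speech_granite.generate(**text_inputs, max_new_tokens=200, num_beams=4)
print(tokenizer.decode(
    text_outputs[0, text_inputs["input_ids"].shape[-1]:], skip_special_tokens=True
))
```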
**Model Architecture:**

The architecture of granite-speech-3.3-2b consists of the following components:

(1) Speech encoder: 10 conformer blocks trained with Connectionist Temporal Classification (CTC) on character-level targets on the subset containing only ASR corpora (see configuration below). In addition, our CTC encoder uses block attention over 4-second audio blocks and self-conditioned CTC from the middle layer.

| Configuration parameter | Value                |
|-------------------------|----------------------|
| Input dimension          | 160 (80 logmels x 2) |
| Nb. of layers            | 10                   |
| Hidden dimension         | 1024                 |
| Nb. of attention heads   | 8                    |
| Attention head size      | 128                  |
| Convolution kernel size  | 15                   |
| Output dimension         | 42                   |

(2) Speech projector and temporal downsampler (speech-text modality adapter): we use a 2-layer window query transformer (q-former) operating on blocks of 15 1024-dimensional acoustic embeddings coming out of the last conformer block of the speech encoder, which are downsampled by a factor of 5 using 3 trainable queries per block and per layer. The total temporal downsampling factor is 10 (2x from the encoder and 5x from the projector), resulting in a 10Hz acoustic embedding rate for the LLM. The encoder, projector, and LoRA adapters were fine-tuned/trained jointly on all the corpora mentioned under **Training Data**.
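As a quick sanity check of the downsampling arithmetic above (a sketch: the 100 Hz log-mel frame rate assumes the common 10 ms hop, which is not stated on this card):

```python
logmel_rate_hz = 100                   # assumed: standard 10 ms log-mel frame hop
encoder_factor = 2                     # 2 stacked frames (80 logmels x 2 = 160 dims)
block_size, queries_per_block = 15, 3  # q-former block size and trainable queries
projector_factor = block_size // queries_per_block  # 15 // 3 = 5

total_factor = encoder_factor * projector_factor    # 2 * 5 = 10
print(total_factor, logmel_rate_hz / total_factor)  # 10 10.0 -> 10Hz rate for the LLM
```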

(3) Large language model: granite-3.3-2b-instruct with 128k context length (https://huggingface.co/ibm-granite/granite-3.3-2b-instruct).

(4) LoRA adapters: rank=64, applied to the query and value projection matrices.
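For reference, a hypothetical PEFT configuration matching (4) could look as follows; the `q_proj`/`v_proj` module names are our assumption about the attention-layer naming, not something this card specifies:

```python
from peft import LoraConfig

# Hypothetical adapter config mirroring (4): rank 64 on query/value projections.
lora_config = LoraConfig(
    r=64,                                 # adapter rank, as stated above
    target_modules=["q_proj", "v_proj"],  # assumed names of query/value projections
    task_type="CAUSAL_LM",
)
```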

**Training Data:**

Overall, our training data is largely comprised of two key sources: (1) publicly available datasets and (2) synthetic data created from publicly available datasets, specifically targeting the speech translation task. A detailed description of the training datasets can be found in the table below:
| Name | Task | Nb. hours | Source |
|-----------|--------------|----------------|--------------|
| CommonVoice-17 English | ASR | 2600 | https://huggingface.co/datasets/mozilla-foundation/common_voice_17_0 |
| MLS English | ASR | 44000 | https://huggingface.co/datasets/facebook/multilingual_librispeech |
| Librispeech | ASR | 1000 | https://huggingface.co/datasets/openslr/librispeech_asr |
| VoxPopuli English | ASR | 500 | https://huggingface.co/datasets/facebook/voxpopuli |
| AMI | ASR | 100 | https://huggingface.co/datasets/edinburghcstr/ami |
| YODAS English | ASR | 10000 | https://huggingface.co/datasets/espnet/yodas |
| Switchboard English | ASR | 260 | https://catalog.ldc.upenn.edu/LDC97S62 |
| CallHome English | ASR | 18 | https://catalog.ldc.upenn.edu/LDC97T14 |
| Fisher | ASR | 2000 | https://catalog.ldc.upenn.edu/LDC2004S13 |
| Voicemail part I | ASR | 40 | https://catalog.ldc.upenn.edu/LDC98S77 |
| Voicemail part II | ASR | 40 | https://catalog.ldc.upenn.edu/LDC2002S35 |
| CommonVoice-17 En->De,Es,Fr,It,Ja,Pt,Zh | AST | 2600*7 | Translations with Phi-4 and MADLAD |

**Infrastructure:**
We train Granite Speech using IBM's supercomputing cluster, Blue Vela, which is outfitted with NVIDIA H100 GPUs. This cluster provides a scalable and efficient infrastructure for training our models over thousands of GPUs. The training of this particular model was completed in 9 days on 32 H100 GPUs.

**Ethical Considerations and Limitations:**

Users should be aware that the model may produce unreliable outputs when decoding with `num_beams=1` or when processing extremely short audio clips (<0.1s). Until further updates are released, we recommend using beam sizes greater than 1 and avoiding inputs below the 0.1-second threshold to ensure more consistent performance.

The use of Large Speech and Language Models may involve risks and ethical considerations that people should be aware of. These risks may include bias and fairness, misinformation, and autonomous decision-making. We urge the community to use granite-speech-3.3-2b in a manner consistent with IBM's Responsible Use Guide or similar responsible use structures. IBM recommends using this model for automatic speech recognition tasks. The model's modular design improves safety by limiting how audio inputs can influence the system: if an unfamiliar or malformed prompt is received, the model simply echoes it along with its transcription. This minimizes the risk of adversarial inputs, unlike integrated models that directly interpret audio and may be more exposed to such attacks. Note that more general speech tasks may pose higher inherent risks of triggering unwanted outputs.

To enhance safety, we recommend using granite-speech-3.3-2b alongside Granite Guardian. Granite Guardian is a fine-tuned instruct model designed to detect and flag risks in prompts and responses across key dimensions outlined in the IBM AI Risk Atlas. Its training, which includes both human-annotated and synthetic data informed by internal red-teaming, enables it to outperform similar open-source models on standard benchmarks, providing an additional layer of safety.

**Resources**
- ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
- 🚀 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
- 💡 Learn about the latest Granite learning resources: https://ibm.biz/granite-learning-resources