Category: Windows



Hello there! As the title hints, this post is about solving false alerts generated in SCOM for non-existent clustered VMs / resources.

I recently came across a situation where SCOM generated a lot of false alerts for Hyper-V 2012 R2 clustered resources, reporting that VM resource groups were in a critical state. However, those VMs had already been deleted from the cluster, and the cluster resource monitoring MP should only monitor the resources that actually exist on the cluster. Alerts raised by an "Alert monitor" for deleted cluster resources have to be closed manually in SCOM, because the monitor keeps polling the non-existent resource for a state change; and even after doing this, the alerts for the deleted VMs keep coming back in the console.

The whole problem started when a couple of nodes in the Hyper-V cluster were placed in maintenance mode for an activity and shut down as part of the process. During this window, a couple of VMs were deleted through the VMM management server and, as expected, they disappeared from the cluster. However, SCOM collected data only from the online cluster nodes and could not learn about the deleted VMs from the offline nodes. When the shut-down Hyper-V hosts were brought back online, SCOM started behaving strangely: it still believed the deleted VMs lived on those hosts and generated a lot of false alerts reporting that the deleted VMs were in a critical state.

At this point, the data SCOM holds in its database is inconsistent. There is no way to remove a clustered resource from the cluster management pack view dashboard; we can only place the resource group in maintenance mode.

To solve this bug / data inconsistency in SCOM, the cluster monitoring management packs must be deleted and then imported again. This needs to be done when all cluster nodes are back online and active in the cluster, so the MPs can discover data from every Hyper-V node.

Any custom management packs that depend on the cluster MPs need to be exported first from the Administration view of SCOM. After deleting all the cluster MPs, re-importing them into SCOM along with the custom MPs fixes the issue. It takes about an hour or more for SCOM to rediscover the clusters and update their status.
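
If you prefer PowerShell over the console for the export / remove / re-import cycle, a rough sketch looks like the following. The display-name filters, folder paths and MP file names are illustrative assumptions – substitute the actual cluster MPs and custom MPs from your environment.

# Run on a SCOM management server
Import-Module OperationsManager

# 1. Back up any custom MPs that depend on the cluster MPs (filter is an example)
Get-SCOMManagementPack | Where-Object { $_.DisplayName -like "*Cluster*Override*" } |
    Export-SCOMManagementPack -Path "C:\MPBackup"

# 2. Remove the cluster monitoring MPs (dependent custom MPs must be removed first)
Get-SCOMManagementPack | Where-Object { $_.DisplayName -like "*Cluster*Monitoring*" } |
    Remove-SCOMManagementPack

# 3. Re-import the cluster MPs, then the exported custom MPs
Import-SCOMManagementPack -Fullname "C:\MPs\Microsoft.Windows.Server.ClusterMonitoring.mp"
Import-SCOMManagementPack -Fullname "C:\MPBackup\Custom.Cluster.Overrides.xml"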


The Test failover function in the Azure portal isn't working as expected – this is a known issue and Microsoft is actively working on fixing the bug.

Until then, you may use the following link as a workaround: https://aka.ms/e2e-ie-tempfix

Using workaround link:


As the title says, this post lists the common installation errors you might encounter when deploying Azure Stack for POC purposes.

Time Zone issue: When deploying Azure Stack, the first run of the PowerShell script asks you to input the computer name, IP, DNS and time zone configuration. If the time zone setting is left as is or configured incorrectly – that is, the Azure Stack host operating system's time zone is different – the script will fail after the DC VM is installed, because time synchronization breaks and the Azure Stack host cannot authenticate against the newly deployed Azure Stack DC to join it. An authentication-related error is thrown.
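
If you hit this, a quick sanity check before re-running the deployment is to confirm that the host's time zone matches what you gave the script and that the clock is in sync. The time zone ID below is only an example.

# Show the current time zone of the Azure Stack host
tzutil /g

# Set it explicitly if it is wrong (example ID – use the one you entered in the script)
tzutil /s "Pacific Standard Time"

# Check and retry time synchronization
w32tm /query /status
w32tm /resync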

 

Access Denied on Stack VMs from the host PowerShell session: This happens because, when the VMs are deployed using PowerShell and rebooted as part of their configuration, there is a chance the required Ops-Administrators group does not get added to the VMs. In this case we have to log in / remote into the affected VM and validate its local Administrators group.

 

 
How to add Ops-Admins to the Local Admins group is listed below:
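
A minimal sketch of the check and fix, run inside the affected VM from an elevated PowerShell session. The domain and group names are illustrative – use the actual Ops-Admins group created by your Azure Stack deployment.

# List the current members of the local Administrators group
net localgroup Administrators

# Add the Ops-Admins group if it is missing (domain\group name is an example)
net localgroup Administrators "AZURESTACK\Ops-Admins" /add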

 

The reason the -rerun PowerShell session fails to access the VMs remotely is that we log in to the Azure Stack host as FabricAdministrator, and that same account does not have access to the VMs because the group membership step was skipped or misconfigured.

VM installation failed with an unexpected restart error: There can be multiple reasons why a VM fails at this step. Primarily it points to the underlying hardware specs, and sometimes to the integrity of the mounted ISO image. Since the Azure Stack PowerShell deployment builds the VMs from a pre-built VHD, the most common cause is underlying storage IOPS problems: the VM fails to boot in time or crashes during installation. If a VM fails to boot and the script stops because it needs the VM to be up and remotely accessible via PowerShell, you can delete the VM from Failover Cluster Manager / Hyper-V and -rerun the stack deployment.
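
Deleting the failed VM can also be done from PowerShell on the Azure Stack host before you -rerun the deployment script; a rough sketch, with the VM name as a placeholder only:

# Turn off and remove the failed VM (the name is an example – use the VM that failed)
Stop-VM -Name "ADVM" -TurnOff -Force
Remove-VM -Name "ADVM" -Force
# Remove-VM leaves the VHD files on disk; clean them up if the deployment script recreates them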

 
 
Other issues we commonly encounter are due to resource availability and timeouts caused by VMs failing to respond or boot.

The main limitation in deploying an Azure Stack POC is storage. Since all the VMs run off the same disk, the available throughput and IOPS per VM drop considerably. Spreading the disks across multiple HDDs in the S2D pool will improve performance. Another feasible option is to use SSHDs – if the budget allows :).

In the publicly released Azure Stack POC, the NuGet packages are slightly different. The cloud deployment NuGet package now splits the VM configuration per role, which means we have to define the memory, CPU and dynamic memory settings in the OneNodeRole.xml configuration file of each individual VM role.

As we can see from the config file above, the memory, CPU and dynamic memory settings must be defined for each Azure Stack VM individually inside the "CloudDeployment.1.0.597.18.nupkg" package. The exact path to the individual VM role configurations can be seen in the archive screenshot above – "\Content\Configuration\Roles\Fabric\*". If this deserves a detailed post, please send your requests and I will write a new blog post covering it.


If your PC / laptop is taking ages to shut down or reboot, there are multiple areas we need to check to identify the root cause and fix it permanently. In this post I am going to show you the common areas to look at when you find yourself in such a situation.

 

The first thing to do is fire up Task Manager and watch resource utilization from the moment you initiate the shutdown/reboot. In Windows 8 and 10, launching Task Manager lands you on the Processes tab by default unless you previously chose "Fewer details"; if so, click "More details" to get back to the Processes tab. The key areas to watch here are the CPU, Memory and Disk columns.

Programs actively used during your session generally live in physical memory (RAM), while passive/minimized programs get moved off to the page file depending on RAM utilization and availability. So, in our case, the data of the active programs needs to be written to disk to commit the work we did just before hitting that shutdown/reboot button or command.

 

CPU, memory and disk utilization depend heavily on the programs you used during your session and the background programs that run as part of the OS or other software. As an example, I launched VMware Workstation, which consumed about 56 GB of my RAM for its operations. Closing the program does not immediately free up that 56 GB of RAM, because the program has child processes that need the in-memory data written/committed to disk to save the state I left the program in.

See the screenshot below for reference:

This is the Resource Monitor tool (type "resmon" at a command prompt to fire it up); from this utility you can dig further into which processes are consuming resources. Look at the screenshot above: although memory usage has come down to 4%, the disk still has read/write operations going on. Sort by "Total (B/sec)" in descending order to see which process is hitting the disk. In my case, VMware Workstation uses .vmem files to back the physical memory it hands out to the virtual machines running inside the application. Once I shut down or suspend those VMs, the RAM they were using has to be saved to disk in the .vmem files. How long this takes depends on the amount of RAM each VM was using – the larger the VM's memory configuration/utilization, the longer it takes to commit the data to disk.

 

Similarly, there are multiple child processes/background tasks running in the back end, and until those tasks complete, the system will not shut down or reboot. These tasks do not appear on the "apps preventing shutdown" screen because they are actively busy closing out session data. There are plenty of other tools we can use to identify which processes are consuming resources, but Resource Monitor is the first place to start.
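
If you prefer a command-line view, the quick PowerShell snapshot below shows the heaviest processes by memory and by disk I/O. It is only a rough sketch – the disk counter sample covers a single second, so run it a few times while the machine is busy.

# Top 5 processes by memory (working set)
Get-Process | Sort-Object WorkingSet64 -Descending |
    Select-Object -First 5 Name, Id, @{n='WorkingSetMB';e={[math]::Round($_.WorkingSet64/1MB)}}

# Top 5 processes by disk I/O (bytes per second, one-second sample)
(Get-Counter '\Process(*)\IO Data Bytes/sec').CounterSamples |
    Where-Object { $_.InstanceName -notin @('_total','idle') } |
    Sort-Object CookedValue -Descending |
    Select-Object -First 5 InstanceName, @{n='IOBytesPerSec';e={[math]::Round($_.CookedValue)}}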

 

I will keep adding more information on this topic, but if you have any queries feel free to comment and I will try to address them.

 

Cheers!

Chaladi

 


Sometimes importing VHDX/other files into the library server, or scanning the library server share, fails with "Unable to import xxxx. xxxx files can only be imported by library servers running Windows Server 2012 or later". The error in the SCVMM Jobs view looks like the one below.

This issue happens when the VMM library server information in the VMM database is incorrect.

Run the SQL query below against the VMM database to see the library server OS information.

SELECT * FROM [dbo].[tbl_ADHC_Library]

If the query shows the affected VMM library server's OperatingSystemVersion as "0.0.0.0", the information is corrupt and needs to be fixed. The output below shows "0.0.0.0" for the hyd-sql-01 VMM library, so it must be updated to fix the library issue.

SQL Query

Next, update the tbl_ADHC_Library table with the appropriate operating system version of your Windows server. You can get the OS version of the server using the command below at a cmd prompt.

systeminfo | findstr OS
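
If you prefer PowerShell, the same version string can be pulled with a one-liner (this is a generic Windows query, not VMM-specific):

# Returns the bare OS version, e.g. 6.3.9600 on Windows Server 2012 R2
(Get-CimInstance Win32_OperatingSystem).Version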

For example, my server is 2012 R2, so I updated the table with "6.3.9600" as shown below. This fixes the library import issue.

 

Run the SQL query below against the tbl_ADHC_Library table of the VMM database to update the OS version:

update tbl_ADHC_Library set OperatingSystemVersion = '6.3.9600' where OperatingSystemVersion like '0.0.0.0'

Update SQL Table

This should fix the import issue, and the VHDX/other files can now be seen in the library shares we've configured.


When attempting to install System Center Virtual Machine Manager 2012/R2, the installation might fail because WebDeploy.msi fails with Windows Installer error code 1603, as below:

 

Looking at the logs, you can see that Web Deploy is failing to install on the system. If you take a closer look, the reason it reports is that version 3.5 already exists. Take a look at the log snippet below:

As highlighted above, it reports "A newer version of Web Deploy was found on this machine". To resolve the installation issue, navigate to Programs and Features and uninstall Web Deploy from there. Once it is uninstalled, retry the SCVMM installation and it should get past the Web Deploy step.
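
If you would rather do this from PowerShell than Programs and Features, a rough sketch is below. Note that querying Win32_Product is slow, and the product name filter is an assumption – check what is actually installed before uninstalling.

# Find the installed Web Deploy product and its MSI product code
$wd = Get-CimInstance Win32_Product -Filter "Name LIKE '%Web Deploy%'"
$wd | Select-Object Name, Version, IdentifyingNumber

# Uninstall it by product code, then retry the SCVMM setup
msiexec /x $wd.IdentifyingNumber /qn /norestart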

The reason a newer version exists is usually that either SCVMM 2016 was previously installed (or an installation was attempted) on this system, or Web Deploy components were installed for some other application's requirements.

 


In Windows 8 / 10, launching File Explorer places you in the Quick access view instead of the legacy "My Computer / This PC". This post will help you configure File Explorer to open "This PC" instead of Quick access.

Launch File Explorer, which lands you in the Quick access view as below:

 

Quick Access

Right-click on "Quick access" as shown below to open the Options menu.

Settings

 

Change the option "Open File Explorer to:" to "This PC" in the drop-down menu, click Apply and hit OK. This configures File Explorer to launch into This PC instead of Quick access.
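
The same setting can be changed from PowerShell by editing the Explorer registry value behind that drop-down. To the best of my knowledge 1 means "This PC" and 2 means "Quick access" – treat the value as an assumption and verify it on your build.

# Open File Explorer to "This PC" for the current user (1 = This PC, 2 = Quick access)
Set-ItemProperty -Path 'HKCU:\Software\Microsoft\Windows\CurrentVersion\Explorer\Advanced' -Name 'LaunchTo' -Value 1

# Restart Explorer so the change takes effect
Stop-Process -Name explorer -Force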

 


Some IT people overlook a machine's SID attribute, forgetting the importance of having unique SIDs/GUIDs.

1. Create a clone of a VM in Hyper-V or VMware Workstation, keep both machines in a workgroup, and see whether you can get the two clones communicating with each other.
2. Try joining the same clone VMs to a lab domain and see how it goes.
3. After the AD join, with domain user accounts added in the VM's lusrmgr.msc, log in to the VM with one of the AD accounts, demote the VM from the domain, and then try running Sysprep with those domain accounts still present in the local user accounts – see whether Sysprep completes successfully.

Please do this lab work and comment with your results…
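
If you want to compare the machine SIDs of the two clones while doing the lab, a quick check looks like this (whoami is built in; PsGetSid is a Sysinternals tool you would need to download separately, and the Sysprep switches shown are just the common generalize options):

# The user SID shown here embeds the machine SID for local accounts
whoami /user

# Sysinternals PsGetSid – run on each clone and compare the machine SIDs
.\PsGetsid.exe

# Regenerate the SID on a clone with Sysprep (elevated prompt)
& "$env:WINDIR\System32\Sysprep\sysprep.exe" /generalize /oobe /shutdown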

Just a sneak peek at the SID error…

sid-err-crp


We’re gonna solve the DHCP server authorization issue in this post. The error looks like the one below:

“The authorization of DHCP Server failed with Error Code: 20070. The DHCP service couldn’t contact Active Directory.”

dhcp-post-config-20070 error

This is usually due to insufficient user permissions in AD. Make sure you enter Domain Administrator (DA) credentials in the DHCP post-install commit dialog instead of proceeding with the logged-in account. Even though you logged in to the DC with some user credentials, that does not necessarily mean the account is a DA/EA; it could be a local administrator that has no admin rights on the domain/forest. Check the DA user in ADUC and make sure you enter those credentials to solve this.

If the credentials really are DA, other things to try: make sure the AD services are up and running (check that ADUC launches), restart the DHCP Server service, and try re-installing DHCP from Server Manager. If you still encounter issues, please leave a message here so we can look into it further and get it resolved.
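
Once you have Domain Admin credentials in hand, the authorization can also be retried directly from PowerShell; the server name and IP below are examples.

# Check which DHCP servers are currently authorized in AD
Get-DhcpServerInDC

# Authorize this DHCP server (run as a Domain Admin)
Add-DhcpServerInDC -DnsName "dhcp01.contoso.local" -IPAddress 192.168.1.10

# Restart the DHCP Server service so it picks up the authorization
Restart-Service DHCPServer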

Cheers!
Chaladi


Howdy! Today’s blog post is all about Microsoft’s Windows Server Failover Clustering (WSFC). I’ve noticed a couple of limitations in WSFC, and I’m gonna keep adding to this list as I identify more, so keep checking back.

 

First of all, the Shared VHDX issue. Shared VHDX is a clustered storage feature introduced in Windows Server 2012 R2 for nodes participating in a guest cluster. If you’re wondering what Shared VHDX is and how it works, please see here.

 

So, say you have a two-node Shared VHDX cluster and you attach 4 disks to the clustered resource – SQL Server is the example here – with both cluster nodes having those 4 disks attached in Shared VHDX mode. This presents the storage as shared storage, so both nodes in the cluster can see the disks. Now, if I want to move SQL from Node A to Node B, all the shared VHDXs on the node owning SQL go into a reserve state and come online on Node B, since we have moved SQL there; the SQL-associated disks and components eventually move along with it.
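
For reference, the same move can be triggered from PowerShell on either cluster node; the role and node names below are examples.

# Move the SQL Server clustered role to the other node
Move-ClusterGroup -Name "SQL Server (MSSQLSERVER)" -Node "CLUSTERNODE2"

# Check the state of the role and its disks after the move
Get-ClusterGroup -Name "SQL Server (MSSQLSERVER)" | Get-ClusterResource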

Cluster Resource Move failure

Now, if for some reason one of the shared disks is not presented to Node B via the Hyper-V Manager VM settings, failing SQL over to Node B will fail. The only error you get is “Cluster disk not connected”. Generating the cluster logs via PowerShell with “Get-ClusterLog -UseLocalTime -TimeSpan 5 -Destination D:\logs” does not tell you much more either, as the log below shows.

 

“ERR   [RCM] rcm::RcmApi::MoveGroup: ERROR_CLUSTER_DISK_NOT_CONNECTED(5963)’ because of ‘Move of group SQL Server (MSSQLSERVER) to node CLUSTERNODE2 is not approved’”

 

The limitation I’m talking about here is that the cluster does not help you identify exactly which shared VHDX is not visible to Node B. If Disk 2 is not presented to Node B, the cluster knows in the background that it is failing to bring Cluster Disk 2 online on Node B, so it should log something like “Bringing Disk 1 online on Node X — Pass, Bringing Disk 2 online on Node X…” and so on, which would help you identify the missing shared VHDX on each node.

In the command above I used a time span of 5 minutes to pull the cluster logs. This saves me from generating a huge file and reading through a lot of unrelated entries, since I attempted to move SQL off Node A within the last 5 minutes.

Now, you may think you can use Disk Management to spot the differences between the disks, and that works if you only have a few disks and they are all different sizes. But if you have 15 or so storage disks presented via Hyper-V and almost all of them are the same size, say 500 GB, it is a waste of time to go through all those disk numbers comparing the disks on each node side by side.
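
A quicker way, run from the Hyper-V host, is to compare the VHDX paths attached to each guest cluster node VM; the VM names are examples, and you would add -ComputerName if the VMs live on different hosts.

# List the disks attached to each guest cluster node VM
$nodeA = Get-VMHardDiskDrive -VMName "CLUSTERNODE1"
$nodeB = Get-VMHardDiskDrive -VMName "CLUSTERNODE2"

# Any path listed here is attached to one node's VM but not the other's
Compare-Object $nodeA.Path $nodeB.Path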

 

When I describe this limitation from a Shared VHDX perspective, it could equally apply to SAN storage presented to the cluster nodes via EMC PowerPath or the like. With SAN storage presented directly to a cluster node, though, we can use the PowerPath console to identify the missing disks using the naming convention applied to the LUNs during zoning. Still, I feel this is a limitation in Windows clustering that badly needs to be addressed.

 

And with Shared VHDX there is also a big issue with redirected I/O, which can kill your critical applications through poor disk performance. For this reason, a cluster resource with heavy disk utilization should not use Shared VHDX as its storage. I will write more about the redirected I/O issue in a separate post.